ABSTRACT

Motivated by the problem of simultaneously preserving
confidentiality and usability of data outsourced to third-party
clouds, we present two different database encryption schemes
that largely hide data but reveal enough information to
support a wide range of relational queries. We provide a
security definition for database encryption that captures
confidentiality based on a notion of equivalence of databases
from the adversary’s perspective. As a specific application,
we adapt an existing algorithm for finding violations of
privacy policies to run on logs encrypted under our schemes
and observe low to moderate overheads.

1. INTRODUCTION 

To reduce infrastructure costs, small- and medium-sized
businesses may outsource their databases and database
applications to third-party clouds. However, proprietary data
is often private, so storing it in a cloud raises
confidentiality concerns. Client-side encryption of databases
prior to outsourcing alleviates confidentiality concerns, but
it also makes it impossible to run any relational queries on the
outsourced databases. Several prior research projects have
investigated encryption schemes that trade off perfect data
confidentiality for the ability to run relational queries [SEE 
EH]. However, these schemes either require client-side pro¬ 
cessing [21] , or require additional hardware support S , or 
support a very restrictive set of queries [39]. Our long-term 
goal is to develop database encryption schemes that can (1)
run on commodity off-the-shelf (COTS) cloud infrastructure
without any special hardware or any kernel modifications,
(2) support a broad range of relational queries on the
encrypted database without interaction with the client, and
(3) provide provable end-to-end security and a precise
characterization of what information encryption leaks for a given
set of queries. Both in objective and in method, our goal is
similar to that of CryptDB [33], which attains properties (1)
and (2), but not (3).


As a step towards our goal, in this paper, we design two 
database encryption schemes, Eunomia DET and Eunomia KH , 
with properties (1), (2) and (3). Our design is guided by, 
and partly specific to, a single application domain, namely, 
audit of data-use logs for violations of privacy policies. This 
application represents a real-world problem. For example,
in the US, the healthcare and finance industries must handle
client data in accordance with the privacy portions of the
federal acts Health Insurance Portability and Accountability
Act (HIPAA) [23] and Gramm-Leach-Bliley Act (GLBA)
[37], respectively, in addition to state legislation. To remain
compliant with privacy legislation, organizations record logs 
of privacy-relevant day-to-day operations such as data ac¬ 
cess/use and employee role changes, and audit these logs 
for violations of privacy policies, either routinely or on a 
case-by-case basis. Logs can get fairly large and are often 
organized in commodity databases. Further, audit is com¬ 
putationally expensive but highly parallelizable, so there is 
significant merit in outsourcing the storage of logs and the 
execution of audit algorithms to third-party clouds. 

Generality. Audit is a challenging application for encryp¬ 
tion schemes because audit requires almost all relational 
query operations on logs. These operations include selec¬ 
tion, projection, join, comparison of fields, and what we call 
displaced comparison (is the difference between two fields 
less than a given constant?). Both our encryption schemes 
support all these query operations. The only standard query 
operation not commonly required by privacy audit (and not 
supported by our schemes) is aggregation (sums and av¬ 
erages; counting queries are supported). Any application 
that requires only the query operations listed above can be 
adapted to run on Eunomia DET or Eunomia KH , even though 
this paper focuses on the audit application only. 

Eunomia DET and Eunomia KH make different trade-offs between
efficiency and flexibility in the supported queries.
Eunomia DET uses deterministic encryption and has very low
overhead (3% to 9% over a no-encryption baseline in our audit
application), but requires anticipating, prior to encryption,
which pairs of columns will be joined in audit queries.
Eunomia KH uses Popa et al.’s adjustable keyed hash scheme
[34, 33] for equality tests and has higher overhead (63% to
406% in our audit application), but removes the need to
anticipate joined columns ahead of time.

Security. We characterize formally what information about 
an encrypted log (database) our schemes may leak to a
probabilistic polynomial-time (PPT) adversary with direct
access to the encrypted store (modeling a completely
adversarial cloud). We prove that by
looking at a log encrypted with either of our schemes, an 
adversary can learn (with non-negligible probability) only 
that the plaintext log lies within a certain, precisely defined 
equivalence class of logs. This class of logs characterizes the 
uncertainty of the adversary and, hence, the confidentiality 
of the encrypted log [3]. Prior work like CryptDB lacks such 
a theorem. CryptDB uses a trusted proxy server to dynam¬ 
ically choose the most secure encryption scheme for every 
database column (from a pre-determined set of schemes), 
based on the queries being run on that column. While each 
scheme is known to be secure in isolation and it is shown that 
at any time, a column is encrypted with the weakest scheme 
that supports all past queries on the column [?, Theorem 2], 
there is no end-to-end characterization of information leaked 
after a sequence of queries. (In return, CryptDB supports 
all SQL-queries, including aggregation queries, which we do 
not.) 

Functionality. To demonstrate that our encryption schemes
support nontrivial applications, we adapt an audit algorithm
called reduce from our prior work [17] to run on logs encrypted
with either Eunomia DET or Eunomia KH . We implement and
test the adapted algorithm, ereduce, on both schemes and
show formally that it is functionally correct on Eunomia DET .

Since ereduce exercises all the log-query operations listed 
above, this is strong evidence that our schemes support 
those operations correctly. To run ereduce on Eunomia DET 
(Eunomia KH ) we need to know prior to encryption (audit) 
which columns in the log will be compared for equality or 
joined. To this end, we develop a new static analysis of 
policies, which we call the EQ mode check. 

Privacy audit often requires comparisons of the form “is
timestamp t1 within 30 days of timestamp t2?” We call
such comparisons “displaced comparisons” (30 days is called 
the “displacement”). To support displaced comparisons, we 
design and prove the security of a new cryptographic sub¬ 
scheme dubbed mOPED (mutable order-preserving encod¬ 
ing with displacement). This scheme extends the mOPE 
scheme of Popa et al. [32] , which does not support displace¬ 
ments, and may be of independent interest. 

Deployability. Both Eunomia DET and Eunomia KH can be
deployed on commodity cloud database systems with some
additional metadata. In both schemes, the client encrypts 
individual data cells locally and stores the ciphertexts in 
a commodity database system in the cloud (possibly incre¬ 
mentally). Audit runs on the cloud without interaction with 
the client and returns encrypted results to the client, which 
can decrypt them. The schemes reveal enough information 
about the data to perform all supported operations, e.g., 
compare two data values for equality to perform equi-joins. 

Contributions. We make the following technical contribu¬ 
tions: 

• We introduce two database encryption schemes, Eunomia DET
and Eunomia KH , that support selection, projection, join,
comparison of fields, and displaced comparison queries.
The schemes trade efficiency for the need to predict
expected pairs of joined columns before encryption. As
a building block, we develop the sub-scheme mOPED,
which allows displaced comparison of encrypted values.

• We characterize confidentiality preserved by our schemes 
as equivalence classes of plaintext logs and prove that 
both our schemes are end-to-end secure. 

• We adapt an existing privacy policy audit algorithm to 
execute on our schemes. We prove the functional cor¬ 
rectness of the execution of our algorithm on Eunomia DET . 

• We implement both our schemes and the adapted audit 
algorithm, observing low overheads on Eunomia DET and 
moderate overheads on Eunomia KH . 


Notation. This paper is written in the context of the pri¬ 
vacy audit application and our encryption schemes are pre¬ 
sented within this context. Accordingly, we sometimes use 
the term “log” or “audit log” when the more general term 
“database” could have been used and, similarly, use the term 
“policy” or “privacy policy” when the more general term 
“query” could have been used. We expect our schemes to 
generalize to other database applications straightforwardly. 

2. OVERVIEW OF EUNOMIA 

We first present the architecture of Eunomia. Then, we 
motivate our choice of encryption schemes through examples 
and discuss policy audit in Eunomia in more detail. Finally, 
we discuss our goals, assumptions, and adversary model. 

2.1 Architecture of Eunomia 

We consider the scenario where an organization, called 
the client or Cl, with sensitive data and audit requirements 
wishes to outsource its log (organized as a relational database) 
and audit process (implemented as a sequence of policy- 
dependent queries) to a potentially compromisable third- 
party cloud server, denoted CS. Cl generates the log from its 
day-to-day operations. Cl then encrypts the log and trans¬ 
fers the encrypted log to the CS. Cl initiates the audit pro¬ 
cess by choosing a policy. The auditing algorithm runs on 
the CS infrastructure and the audit result containing en¬ 
crypted values is sent to Cl, which can decrypt the values in 
the result. 

The mechanism of log generation is irrelevant for us. From 
our perspective, a log is a pre-generated database with a 
public schema, where each table defines a certain privacy¬ 
relevant predicate. For example, the table Roles may con¬ 
tain columns Name and Role, and may define the mapping of 
Cl’s employees to Cl’s organizational roles. Similarly, the ta¬ 
ble Sensitive_accesses may contain columns Name, File_name, 
and Time, recording who accessed which sensitive file at 
what time. Several tables like Sensitive_accesses may con¬ 
tain columns with timestamps, which are integers. 

2.2 Encryption Schemes 

An organization may naively encrypt the entire log with 
a strong encryption scheme before transferring it to a cloud, 
but this renders the stored log ineffective for audit, as audit 
(like most other database computations) must relate
different parts of the log. For example, suppose the log contains
two tables $T_1$ and $T_2$. $T_1$ lists the names of individuals
who accessed patients’ prescriptions. $T_2$ lists the roles of
all individuals in the organization. Consider the privacy policy:

Policy 1: Every individual accessing patients’ prescriptions 
must be in the role of Doctor. 

The audit process of the above policy must read names
from $T_1$ and test them for equality against the list of names
under the role Doctor in $T_2$. This forces the use of an
encryption method that allows equality tests (or equi-joins).
Unsurprisingly, this compromises the confidentiality of the
log, as an adversary (e.g., the cloud host, which observes the
audit process) can detect equality between encrypted fields
(e.g., equality of names in $T_1$ and $T_2$). However, not all is
lost: for instance, if per-cell deterministic encryption is used,
the adversary cannot learn the concrete names themselves.
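
To make the trade-off concrete, the following minimal sketch audits Policy 1 over deterministic tokens only. It is an illustration under stated assumptions, not the paper's implementation: HMAC-SHA256 stands in for deterministic encryption (it is a keyed hash, so it supports equality tests but cannot be decrypted), and the table contents, key values, and the helper name det_token are hypothetical.

```python
import hashlib
import hmac


def det_token(key: bytes, value: str) -> bytes:
    """Deterministic keyed token: equal plaintexts under one key give equal tokens.

    Stand-in for deterministic encryption (assumption); it cannot be decrypted.
    """
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).digest()


# Hypothetical plaintext tables for Policy 1.
T1 = ["alice", "bob"]                          # who accessed prescriptions
T2 = [("alice", "Doctor"), ("bob", "Nurse")]   # (name, role)

# The two name columns are joined during audit, so they share one key.
name_key = b"shared-key-for-name-columns"
role_key = b"key-for-role-column"

# Client side: encrypt cell by cell, and encrypt the policy constant "Doctor".
enc_T1 = [det_token(name_key, n) for n in T1]
enc_T2 = [(det_token(name_key, n), det_token(role_key, r)) for n, r in T2]
enc_doctor = det_token(role_key, "Doctor")

# Cloud side: equi-join on tokens only; no key or plaintext is needed.
doctors = {n for n, r in enc_T2 if r == enc_doctor}
violations = [n for n in enc_T1 if n not in doctors]
```

The cloud learns that the second entry of the access table has no matching Doctor row, and which encrypted names coincide across tables, but it only ever handles tokens; the client maps the reported token back to a name.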

A second form of data correlation necessary for audit is the 
order between time points. Consider the following policy: 

Policy 2: If an outpatient’s medical record is accessed by 
an employee of the Billing Department, then the outpatient 
must have visited the medical facility in the last one month. 

Auditing this policy requires checking whether the dis¬ 
tance between the timestamps in an access table and the 
timestamps in a patient visit table is shorter than a month. 

In this case, the encryption scheme must reveal not just 
the relative order of two timestamps but also the order be¬ 
tween a timestamp and another timestamp displaced by one 
month. Similar to Policy 1, the encryption scheme must 
reveal equality between patient names in the two tables. 

To strike a balance between functional (audit) and
confidentiality requirements, we investigate two cryptographic
schemes, namely Eunomia DET and Eunomia KH , to encrypt
logs. Each cell in the database tables is encrypted
individually. All cells in a column are encrypted using the same
key. Eunomia DET uses deterministic encryption to support
equality tests; two columns that might be tested for
equality by subsequent queries are encrypted with the same key.
Eunomia DET requires that log columns that might be tested
for equality during audit are known prior to the encryption.
Audit under Eunomia DET is quite efficient. However,
adapting encrypted logs to audit different policies that require
different column equality tests requires log re-encryption,
which is costly. Our second scheme, Eunomia KH , handles
frequent policy updates efficiently. Eunomia KH relies on the
adjustable keyed hash scheme [34, 33] for equality tests. Prior
to audit, a transfer token is generated for each pair of columns
that needs to be tested for equality. Eunomia KH additionally
stores keyed hashes of all cells. Audit under Eunomia KH
requires the audit algorithm to track the provenance of each
ciphertext (i.e., from which table and which column the ciphertext
originated) and is less efficient than audit under Eunomia DET .

To support timestamp comparison with displacement (shown
in Policy 2), the mOPED scheme, described in Section 4,
is used by both Eunomia DET and Eunomia KH . Like its
predecessor, mOPE [32], the scheme adds an additional search
tree (additional metadata) to the encrypted database on 
CS. (Supporting displacements is necessary for a practi¬ 
cal audit system because privacy regulations use displace¬ 
ments to express obligation deadlines. Out of 84 HIPAA pri¬ 
vacy clauses, 7 use displacements. Cignet Health of Prince 
George’s County, Maryland was fined $1.3 million for vio¬ 
lating one of these clauses, §164.524.) 

The encrypted database has a schema derived from the
schema of the plaintext database and may be stored on CS
using any standard database management system (DBMS).
The DBMS may be used to index the encrypted cells. As 
shown in E2. database indexing plays a key role in improv¬ 
ing the efficiency of the audit process. Hence, we develop 
our encryption scheme in such a way that it is possible to 
leverage database indexing supported by commodity DBMS. 

2.3 Policies and audit 

Privacy policies may be extracted from privacy legislation
like HIPAA [23] and GLBA [37], or from internal company
requirements. Technically, a privacy policy specifies a
constraint on the log. For example, Policy 1 of Section 2.2
requires that any name appearing in table $T_1$ appear in
table $T_2$ with role Doctor. Generally, policies can be complex
and may mention various entities, roles, time, and subjective
beliefs. For instance, DeYoung et al.’s formalizations of
the HIPAA and GLBA Privacy Rules span over 80 and 10
pages, respectively [16]. We represent policies as formulas of
first-order logic (FOL) because we find it technically
convenient and because FOL has been demonstrated in prior work
to be adequate for representing policies derived from
existing privacy legislation (DeYoung et al., mentioned above,
use the same representation). We describe this logic-based
representation of policies in Section 3.

Our audit algorithm adapts our prior algorithm, reduce [17],
that works on policies represented in FOL. This algorithm
takes as input a policy and a log and reduces the policy 
by checking the policy constraints on the log. It outputs 
constraints that cannot be checked due to lack of informa¬ 
tion (missing log tables, references to future time points, or 
need for human intervention) in the form of a residual policy. 
Similar to reduce, our adapted algorithm, ereduce, uses 
database queries as the basic building block. Our encryption 
schemes permit queries with selection, projection, join, com¬ 
parison and displaced comparison operations. Our schemes 
do not support queries like aggregation (which would re¬ 
quire an underlying homomorphic encryption scheme and 
completely new security proofs). 

To run ereduce on Eunomia DET or Eunomia KH , we need to
identify columns that are tested for equality. This information
is needed prior to encryption for Eunomia DET and prior to audit
for Eunomia KH , as explained in Section 4. We develop a static
analysis of policies represented in FOL, which we call the 
EQ mode check, defined in Section [3 to determine which 
columns may need to be compared for equality when the 
policy is audited. 

2.4 Adversary model and Security Goals 

Assumptions and threat model. In our threat model, Cl 
is trusted but CS is an honest but curious adversary with the 
following capability: CS can run any polynomial time algo¬ 
rithm on the stored (encrypted) log, including the audit over 
any policy. We assume that Cl generates keys and encrypts 
the log with our encryption schemes before uploading it to 
CS. Audit runs on the CS infrastructure but (by design) it 
does not perform decryption. Hence, CS never sees plaintext 
data or the keys, but CS can glean some information about 
the log, e.g., the order of two fields or the equality of two 
fields. The output of audit may contain encrypted values 
indicating policy violations, but these values are decrypted 
only at Cl. 

\[
\begin{array}{llcl}
\text{Atoms} & P & ::= & p(t_1, \ldots, t_n) \mid \mathsf{timeOrder}(t_1, d_1, t_2, d_2) \mid t_1 = t_2 \\
\text{Guard} & g & ::= & P \mid \top \mid \bot \mid g_1 \wedge g_2 \mid g_1 \vee g_2 \mid \exists x.\, g \\
\text{Formula} & \varphi & ::= & P \mid \top \mid \bot \mid \varphi_1 \wedge \varphi_2 \mid \varphi_1 \vee \varphi_2 \mid \forall \vec{x}.\,(g \rightarrow \varphi) \mid \exists \vec{x}.\,(g \wedge \varphi)
\end{array}
\]

Figure 1: Policy specification logic syntax

We assume that privacy policies are known to the
adversary. This assumption may not be true for an organization’s
internal policies, but relaxing this assumption only
simplifies our formal development. To audit over logs encrypted
with Eunomia DET , any constants appearing in the policy (like
“Doctor” in Policy 1 of Section 2.2) must be encrypted
before the audit process starts, so CS can recover the
association between ciphertexts and plaintexts of constants that
appear in the (publicly known) privacy policy. Similarly,
in Eunomia KH , the hashes of constants in policies must be
revealed to the adversary, in a set

Security and functionality goals. (Confidentiality) Our 
primary goal is to protect the confidentiality of the log’s 
content, despite any compromise of CS, including its infras¬ 
tructure, employees, and the audit process running on it. 
(Expressiveness) Our system should be expressive enough to 
represent and audit privacy policies derived from real legis¬ 
lation. In our evaluation, we work with privacy rules derived 
from HIPAA and GLBA. 

Log equivalence. Central to the definition of the end-to-end
security property that we prove for Eunomia DET and
Eunomia KH is the notion of log equivalence. It characterizes
what information about the database remains confidential 
despite a complete compromise of CS. Our security defi¬ 
nition states that the adversary can only learn that the log 
belongs to a stipulated equivalence class of logs. The coarser 
our equivalence, the stronger our security theorem. 

For semantically secure encryption, we could say that two
logs are equivalent if they are the same length. When the
encryption permits join, selection, comparison and displaced
comparison queries, this definition is too strong. For
example, the attacker must be allowed to learn that two constants
in the log (e.g., Doctor and Nurse) are not equal if they lie
in different columns that the attacker can try to join. Hence,
we need a refined notion of log equivalence, which we for¬ 
malize in Section IOI 

3. POLICY AND LOG SPECIFICATIONS 

We review the logic that we use to represent privacy poli¬ 
cies and give a formal definition of logs (databases). These 
definitions are later used in the definition and analysis of 
our encryption schemes and the ereduce audit algorithm. 

Policy logic. We use the guarded-fragment of first-order 
logic introduced in [T] to represent privacy policies. The 
syntax of the logic is shown in Figure 1. Policies or formulas
are denoted $\varphi$. Terms $t$ are either constants $c$, $d$
drawn from a domain $\mathcal{D}$ or variables $x$ drawn from a
set Var. (Function symbols are disallowed.) $\vec{t}$ denotes a
list of terms. The basic building block of formulas is atoms,
which represent relations between terms. We allow three kinds
of atoms. First, $p(t_1, \ldots, t_n)$ represents a relation which
is established through a table named $p$ in the audit log. The
symbol $p$ is called a predicate (or, interchangeably, a table).
The set of all predicate symbols is denoted by $\mathcal{P}$. An
arity function $\alpha : \mathcal{P} \rightarrow \mathbb{N}$
specifies how many arguments each predicate
takes (i.e., how many columns each table has). Second, for
numerical terms, we allow comparison after displacement
with constants, written timeOrder($t_1, d_1, t_2, d_2$). This
relation means that $t_1 + d_1 \leq t_2 + d_2$. Here, $d_1, d_2$ must
be constants. Third, we allow term equality, written $t_1 = t_2$.
Although we restrict atoms of the logic to these three
categories only, the resulting fragment is still very expressive.
All the HIPAA- and GLBA-based policies tested in prior
work [17] and all but one clause of the entire HIPAA and
GLBA privacy rules formalized by DeYoung et al. [16] lie
within this fragment.
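
On plaintext data, the displaced comparison is a one-line predicate. The sketch below is illustrative only: it reads timeOrder as the non-strict comparison t1 + d1 ≤ t2 + d2 (an assumption about the intended inequality), treats timestamps as integer seconds (also an assumption), and introduces a hypothetical helper within that combines two timeOrder atoms into a "within d of" test in the way the policies in this paper do.

```python
DAY = 24 * 60 * 60  # assumption: timestamps are integers counted in seconds


def time_order(t1: int, d1: int, t2: int, d2: int) -> bool:
    """timeOrder(t1, d1, t2, d2) holds when t1 + d1 <= t2 + d2."""
    return t1 + d1 <= t2 + d2


def within(t_early: int, t_late: int, displacement: int) -> bool:
    """True when 0 <= t_late - t_early <= displacement.

    Built from two timeOrder atoms, mirroring how a policy expresses
    "t_late happens no more than `displacement` after t_early".
    """
    return (time_order(t_early, 0, t_late, 0)
            and time_order(t_late, 0, t_early, displacement))
```

For example, within(visit, access, 30 * DAY) expresses "the access happened at most 30 days after the visit". The encrypted schemes must answer the same question without ever seeing the plaintext timestamps.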

Formulas or policies, denoted $\varphi$, contain standard logical
connectives $\top$ (“true”), $\bot$ (“false”), $\wedge$ (“and”),
$\vee$ (“or”), $\forall x$ (“for every $x$”) and $\exists x$ (“for
some $x$”). Saliently, the form of quantifiers $\forall x$ and
$\exists x$ is restricted: each quantifier must include a guard,
$g$. As shown in [17], this restriction, together with the mode
check described in Section [?], ensures that audit terminates
(in general, the domain $\mathcal{D}$ may be infinite).
Intuitively, one may think of a policy $\varphi$ as enforcing a
constraint on the predicates it mentions, i.e., on the tables of
the log. A guard $g$ may be thought of as a query on the log
(indeed, the syntax of guards generalizes Datalog, a well-known
database query language). The policy
$\forall \vec{x}.(g \rightarrow \varphi)$ may be read as “for every
result $\vec{x}$ of the query $g$, the constraint $\varphi$ must
hold.” Dually, $\exists \vec{x}.(g \wedge \varphi)$ may be read as
“some result $\vec{x}$ of the query $g$ must satisfy the constraint
$\varphi$.”

Example 1. Consider the following policy, based on §6802(a) 
of the GLBA privacy law: 

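\[
\begin{array}{l}
\forall p_1, p_2, m, q, a, t.\ \big(\mathsf{send}(p_1, p_2, m, t) \wedge \mathsf{tagged}(m, q, a) \wedge \mathsf{activeRole}(p_1, \mathit{institution}) \wedge {} \\
\quad \mathsf{notAffiliateOf}(p_2, p_1, t) \wedge \mathsf{customerOf}(q, p_1, t) \wedge \mathsf{attr}(a, \mathit{npi})\big) \\
\rightarrow \Big( \big(\exists t_1, m_1.\ \mathsf{send}(p_1, q, m_1, t_1) \wedge \mathsf{timeOrder}(t_1, 0, t, 0) \wedge {} \\
\qquad \mathsf{timeOrder}(t, 0, t_1, 30) \wedge \mathsf{discNotice}(m_1, p_1, p_2, q, a, t)\big) \\
\quad \vee \\
\quad \big(\exists t_2, m_2.\ \mathsf{send}(p_1, q, m_2, t_2) \wedge \mathsf{timeOrder}(t, 0, t_2, 0) \wedge {} \\
\qquad \mathsf{timeOrder}(t_2, 0, t, 30) \wedge \mathsf{discNotice}(m_2, p_1, p_2, q, a, t)\big) \Big)
\end{array}
\]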

The policy states that principal $p_1$ can send a message $m$ to
principal $p_2$ at time $t$, where the message $m$ contains principal
$q$’s attribute $a$ (e.g., account number) and (i) $p_1$ is in the
role of a financial institution, (ii) $p_2$ is not a third-party
affiliate of $p_1$ at time $t$, (iii) $q$ is a customer of $p_1$ at
time $t$, (iv) the attribute $a$ is non-public personal information
(npi, e.g., a social security number), only if any one of the
two conditions separated by $\vee$ holds. The first condition
says that the institution has already sent a notification of
this disclosure in the past 30 days to the customer $q$ (i.e.,
$0 \leq (t - t_1) \leq 30$). The second condition says that the
institution will send a notification of this disclosure within
the next 30 days (i.e., $0 \leq (t_2 - t) \leq 30$).

Logs and schemas. An audit log or log, denoted $\mathcal{L}$, is a
database with a given schema. A schema $\mathcal{S}$ is a set of pairs
of the form (tableName, columnNames) where columnNames
is an ordered list of all the column names in the table
(predicate) tableName. A schema $\mathcal{S}$ corresponds to a policy
$\varphi$ if $\mathcal{S}$ contains all predicates mentioned in the
policy $\varphi$, and the number of columns in predicate $p$ is
$\alpha(p)$.

Semantically, we may view a log $\mathcal{L}$ as a function that,
given as argument a variable-free atom $p(\vec{t})$, returns either
$\top$ (the entry $\vec{t}$ exists in table $p$ in $\mathcal{L}$) or
$\bot$ (the entry does not exist).
To model the possibility that a log table may be incomplete,
we allow for a third possible response uu (unknown). In our
implementation, the difference between uu and $\bot$ arises from
an additional bit on the table $p$ indicating whether or not
the table may be extended in future. Formally, we say that
log $\mathcal{L}_1$ extends log $\mathcal{L}_2$, written
$\mathcal{L}_1 \geq \mathcal{L}_2$, when for every $p$ and $\vec{t}$,
if $\mathcal{L}_2(p(\vec{t})) \neq \text{uu}$, then
$\mathcal{L}_1(p(\vec{t})) = \mathcal{L}_2(p(\vec{t}))$. Thus,
the extended log $\mathcal{L}_1$ may determinize some unknown entries
from $\mathcal{L}_2$, but cannot change existing entries in
$\mathcal{L}_2$.
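
The three-valued view of logs and the extension relation can be sketched in a few lines. This is a toy model under stated assumptions: the class and function names are hypothetical, a log is held as in-memory sets of tuples, the per-table completeness bit is modeled as a boolean, and the extension check is restricted to a finite set of ground atoms supplied by the caller rather than quantifying over the whole domain.

```python
from enum import Enum


class Truth(Enum):
    TRUE = "true"     # the entry is in the table
    FALSE = "false"   # the entry is known to be absent
    UNKNOWN = "uu"    # the table may still be extended


class Log:
    def __init__(self, tables, complete):
        self.tables = tables      # predicate name -> set of tuples
        self.complete = complete  # predicate name -> bool (the extra bit)

    def lookup(self, pred, args):
        """Evaluate the ground atom pred(args) on this log."""
        if tuple(args) in self.tables.get(pred, set()):
            return Truth.TRUE
        if self.complete.get(pred, False):
            return Truth.FALSE
        return Truth.UNKNOWN


def extends(l1, l2, atoms):
    """Check that l1 extends l2 on the given ground atoms.

    Every answer that l2 already determines must be preserved by l1;
    only answers that are unknown in l2 may change.
    """
    return all(
        l2.lookup(p, a) is Truth.UNKNOWN or l1.lookup(p, a) is l2.lookup(p, a)
        for p, a in atoms
    )


# Hypothetical example: l1 adds one row to `send` and closes the table.
l2 = Log({"send": {("a", "b")}}, {"send": False})
l1 = Log({"send": {("a", "b"), ("c", "d")}}, {"send": True})
atoms = [("send", ("a", "b")), ("send", ("c", "d")), ("send", ("e", "f"))]
```

Here extends(l1, l2, atoms) holds, because l1 only resolves entries that were unknown in l2, while extends(l2, l1, atoms) fails, because l2 no longer determines the row that l1 records.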

Our logic uses standard semantics of first-order logic,
treating logs as models. The semantics, written
$\mathcal{L} \models \varphi$, take into account the possibility of
unknown relations; we refer the reader to [17] for details (these
details are not important for understanding this paper).
Intuitively, if $\mathcal{L} \models \varphi$, then the policy
$\varphi$ is satisfied on the log $\mathcal{L}$; if
$\mathcal{L} \not\models \varphi$, then the policy
is violated; and if neither holds then the log does not have
enough information to determine whether or not the policy
has been violated.

Example 2. The policy in Example 1 can be checked for vi¬ 
olations on a log whose schema contains tables send, tagged, 
activeRole, notAffiliateOf, customerOf, attr and discNotice 
with 4, 3, 2, 3, 3, 2 and 6 columns respectively. In this 
audit, values in several columns may have to be compared 
for equality. For example, the values in the first columns 
of tables send and activeRole must be compared because, 
in the policy, they contain the same variable $p_1$. Similarly,
timestamps must be compared after displacement with con¬ 
stants 0 and 30. The log encryption schemes we define next 
support these operations. 


4. ENCRYPTION SCHEMES 

We present our two log encryption schemes, Eunomia DET
and Eunomia KH , in Section 4.2 and Section 4.3, respectively.
Both schemes use (as a black-box) a new sub-scheme, mOPED, 
for comparing timestamps after displacement, which we present 
in Section m 

4.1 Preliminaries 

We introduce common constructs used throughout the
rest of this section.

Equality scheme. To support policy audit, we determine,
through a static analysis of the policies to be audited, which
pairs of columns in the log schema may have to be tested for
equality or joined. We defer the details of this policy analysis
to Section [3 For now, we just assume that the result of this 
analysis is available. This result, called an equality scheme,
denoted $\delta$, is a set of pairs of the form
$(p_1.a_1, p_2.a_2)$. The key property of $\delta$ is that if,
during audit, column $a_1$ of table $p_1$ is tested for equality
against column $a_2$ of table $p_2$, then
$(p_1.a_1, p_2.a_2) \in \delta$.
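
To give a feel for what an equality scheme contains, the sketch below computes a deliberately naive approximation: it pairs up every two column positions that carry the same variable in a list of atoms. This is not the static analysis used in this paper, whose details are deferred; the function name, the '?' prefix that marks variables, and the use of column indices in place of column names are all assumptions made for illustration.

```python
from itertools import combinations


def naive_equality_scheme(atoms):
    """Pair up column positions that share a variable.

    atoms: list of (predicate, [terms]); terms starting with '?' are
    variables, everything else is a constant. Columns are identified by
    (predicate, 1-based position), which is an illustrative convention.
    """
    positions = {}
    for pred, terms in atoms:
        for i, term in enumerate(terms, start=1):
            if term.startswith("?"):
                positions.setdefault(term, set()).add((pred, i))
    delta = set()
    for cols in positions.values():
        # Sorting gives each unordered pair one canonical orientation.
        for c1, c2 in combinations(sorted(cols), 2):
            delta.add((c1, c2))
    return delta


# A fragment of the guard from Example 1.
atoms = [
    ("send", ["?p1", "?p2", "?m", "?t"]),
    ("tagged", ["?m", "?q", "?a"]),
    ("activeRole", ["?p1", "institution"]),
]
delta = naive_equality_scheme(atoms)
```

On this fragment the result pairs the first column of send with the first column of activeRole (both carry p1) and the third column of send with the first column of tagged (both carry m).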

Policy constants. Policies may contain constants. For
instance, the policy of Example 1 contains the constants npi,
institution, 0 and 30. Before running our audit algorithm
over encrypted logs, a new version of the policy containing
these constants in either encrypted (for Eunomia DET ) or
keyed hash (for Eunomia KH ) form must be created.
Consequently, the adversary, who observes the audit and knows
the plaintext policy, can learn the encryption or hash of
these constants. Hence, these constants play an important
role in our security definitions. The set of all these policy
constants is denoted $C$.

Displacement constants. Constants which feature in the
2nd and 4th argument positions of the predicate timeOrder()
play a significant role in the construction of the mOPED
encoding and our security definition. These constants are
called displacements, denoted $D$. For instance, in Example
1, $D = \{0, 30\}$. For any policy, $D \subseteq C$.

Encrypting timestamps. We assume (conservatively) that
all timestamps in the plaintext log may be compared to each
other, so all timestamps are encrypted (in Eunomia DET ) or
hashed (in Eunomia KH ) with the same key $K_{\mathit{time}}$. This
key is also used to protect values in the mOPED sub-scheme. The
assumption that all timestamps may be compared with each
other can be relaxed substantially (for both schemes) if
the audit policy is fixed ahead of time.

4.2 Eunomia DET 

The log encryption scheme Eunomia DET encrypts each cell 
individually using deterministic encryption. All cells in a 
column are encrypted with the same key. Importantly, if
cells in two columns may be compared during audit (as
determined by the equality scheme $\delta$), then the two columns
also share the same key. Hence, cells can be tested for
equality simply by comparing their ciphertexts. To allow
timestamp comparison after displacement, the encrypted log is
paired with a mOPED encoding of timestamps that we ex¬ 
plain later. Note that it is possible to replace deterministic 
encryption with a cryptographically secure keyed hash and 
a semantically secure ciphertext to achieve the same func¬ 
tionality (the keyed hash value could be used to check for 
equality). However, this design incurs higher space overhead 
than our design with deterministic encryption. 

Technically, Eunomia DET contains the following three
algorithms: KeyGen DET ($1^\kappa$, $\mathcal{S}$, $\delta$),
EncryptLog DET ($\mathcal{L}$, $\mathcal{S}$, $\mathcal{K}$), and
EncryptPolicyConstants DET ($\varphi$, $\mathcal{K}$).

Key generation. The probabilistic algorithm
KeyGen DET ($\cdot$, $\cdot$, $\cdot$) takes as input the security
parameter $\kappa$, the plaintext log schema $\mathcal{S}$, and an
equality scheme $\delta$. It returns a key set $\mathcal{K}$. The
key set $\mathcal{K}$ is a set of triples of the form $(p, a, k)$.
The triple means that all cells in column $a$ of table $p$ must be
encrypted (deterministically) with key $k$. The constraints on
$\mathcal{K}$ are that (a) if $p.a$ contains timestamps, then
$k = K_{\mathit{time}}$, and (b) if $(p_1.a_1, p_2.a_2) \in \delta$,
$(p_1, a_1, k_1) \in \mathcal{K}$ and $(p_2, a_2, k_2) \in \mathcal{K}$,
then $k_1 = k_2$.
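
One simple way to satisfy constraints (a) and (b) is to group columns into classes with a union-find structure and draw one random key per class. The sketch below illustrates that idea only; it is not the paper's KeyGen algorithm. The function name keygen_det, the dictionary representation of the key set, the explicit timestamp_cols argument, and the 16-byte key length are assumptions.

```python
import secrets


def keygen_det(schema, delta, timestamp_cols, key_bytes=16):
    """Assign one random key per class of columns that must share a key.

    schema: {table: [column names]}
    delta: iterable of ((table, col), (table, col)) pairs (equality scheme)
    timestamp_cols: set of (table, col) holding timestamps
    Returns {(table, col): key}, a dictionary form of the key set.
    """
    parent = {(t, c): (t, c) for t, cols in schema.items() for c in cols}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    def union(a, b):
        parent[find(a)] = find(b)

    for a, b in delta:            # constraint (b): joined columns share a key
        union(a, b)
    ts = sorted(timestamp_cols)   # constraint (a): one key for all timestamps
    for col in ts[1:]:
        union(ts[0], col)

    class_keys, keys = {}, {}
    for col in parent:
        root = find(col)
        if root not in class_keys:
            class_keys[root] = secrets.token_bytes(key_bytes)
        keys[col] = class_keys[root]
    return keys
```

Note that merging classes transitively makes more columns comparable than the pairs in the equality scheme strictly require; whether that extra sharing is acceptable depends on the security definition and is not settled by this sketch.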

Encrypting the log. The algorithm
EncryptLog DET ($\cdot$, $\cdot$, $\cdot$) takes as input a plaintext
log $\mathcal{L}$, its schema $\mathcal{S}$, and the key set
$\mathcal{K}$ generated by KeyGen(). It returns a pair
$e\mathcal{L} = (eDB, e\mathcal{T})$ where $eDB$ is the cell-wise
encryption of $\mathcal{L}$ with appropriate keys from $\mathcal{K}$
and $e\mathcal{T}$ is the mOPED encoding.

Encrypting constants in the policy. To audit over logs
encrypted with Eunomia DET , constants in the policy must
be encrypted too (else, we cannot check whether or not an
atom mentioning the constant appears in the encrypted log).
The algorithm EncryptPolicyConstants DET ($\cdot$, $\cdot$) takes
as input a plaintext policy $\varphi$ and a key set $\mathcal{K}$,
and returns a policy $\varphi'$ in which constants have been
encrypted with appropriate keys. The function works as follows: if,
in $\varphi$, the constant $c$ appears in the $i$th position of
predicate $p$, then in $\varphi'$, the $i$th position of $p$ is $c$
deterministically encrypted with the key of
the $i$th column of $p$ (as obtained from $\mathcal{K}$). Other than
this, $\varphi$ and $\varphi'$ are identical.
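
The rewriting step can be sketched as a single pass over the atoms of a policy. The code below is a simplified illustration, not the paper's algorithm: a policy is flattened to a list of table atoms (connectives, quantifiers, and timeOrder displacements are ignored), variables are marked with a hypothetical '?' prefix, keys are looked up by (predicate, position), and HMAC-SHA256 again stands in for deterministic encryption.

```python
import hashlib
import hmac


def det_token(key: bytes, value: str) -> bytes:
    """Stand-in for deterministic encryption (assumption): a keyed hash."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).digest()


def encrypt_policy_constants(atoms, keys):
    """Replace each constant by its token under the key of its column.

    atoms: list of (predicate, [terms]); terms starting with '?' are variables.
    keys: {(predicate, position): key bytes}, positions counted from 1.
    Variables and the shape of each atom are left unchanged.
    """
    encrypted = []
    for pred, terms in atoms:
        new_terms = []
        for i, term in enumerate(terms, start=1):
            if term.startswith("?"):
                new_terms.append(term)  # variables stay as they are
            else:
                new_terms.append(det_token(keys[(pred, i)], term))
        encrypted.append((pred, new_terms))
    return encrypted
```

Because the constant is encrypted under the key of the column it is compared against, the cloud can match it against encrypted cells in that column by plain equality of tokens.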

Remarks. The process of audit on a log encrypted with
Eunomia DET requires no cryptographic operations.
Compared to an unencrypted log, we only pay the overhead
of having to compare longer ciphertexts and some cost for
looking up the mOPED encoding to compare timestamps.
However, auditing for a policy that requires equality tests
beyond those prescribed by an equality scheme $\delta$ is
impossible on a log encrypted for $\delta$. To do so, we would have
to re-encrypt parts of the log, which is a slow operation.
Our second log encryption scheme, Eunomia KH , represents a
different trade-off.

4.3 Eunomia^KH

Eunomia^KH relies on the adjustable keyed hash (AKH) scheme [34, 33] to support equality tests. We review AKH and then describe how we build Eunomia^KH on it.

Abstractly, AKH provides three functions: $\mathsf{Hash}(k, v) = P \times \mathsf{DET}(k_{\mathrm{master}}, v) \times k$ (where $P$ is a point on an elliptic curve, $\mathsf{DET}(\cdot, \cdot)$ is the deterministic encryption function, and $k_{\mathrm{master}}$ is a master encryption key), $\mathsf{Token}(k_1, k_2) = k_2 \times k_1^{-1}$, and $\mathsf{Adjust}(w, \Delta) = w \times \Delta$. $\mathsf{Hash}(k, v)$ returns a keyed hash of $v$ with key $k$ on a pre-determined elliptic curve with public parameters. $\mathsf{Token}(k_1, k_2)$ returns a token $\Delta_{k_1 \to k_2}$, which allows transforming hashes created with key $k_1$ into the corresponding hashes created with $k_2$. The function $\mathsf{Adjust}(w, \Delta)$ performs this transformation: if $w = \mathsf{Hash}(k_1, v)$ and $\Delta = \Delta_{k_1 \to k_2}$, then $\mathsf{Adjust}(w, \Delta)$ returns the same value as $\mathsf{Hash}(k_2, v)$. The AKH scheme allows the adversary to compare two values hashed with keys $k_1$ and $k_2$ for equality only when it knows either $\Delta_{k_1 \to k_2}$ or $\Delta_{k_2 \to k_1}$. Popa et al. prove this security property, reducing it to the elliptic-curve decisional Diffie-Hellman assumption [34].
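The algebra behind Hash, Token and Adjust can be checked with a toy model. The sketch below deliberately swaps the elliptic curve for the order-509 subgroup of squares modulo the safe prime 1019, written multiplicatively, so scalar multiplication of a point becomes modular exponentiation. These parameters are far too small to be secure, and HMAC-SHA256 stands in for DET; all names (`P`, `Q`, `G`, `det_scalar`, `akh_hash`, `token`, `adjust`) are assumptions made for illustration. It needs Python 3.8 or later for `pow(k, -1, Q)`.

```python
import hashlib
import hmac

P = 1019  # toy safe prime, P = 2 * Q + 1 (insecure, illustration only)
Q = 509   # prime order of the subgroup of squares modulo P
G = 4     # generator of that subgroup; plays the role of the curve point


def det_scalar(master_key: bytes, value: str) -> int:
    # Stand-in for DET(k_master, v), reduced to an exponent modulo Q.
    digest = hmac.new(master_key, value.encode("utf-8"), hashlib.sha256).digest()
    return int.from_bytes(digest, "big") % Q


def akh_hash(key: int, value: str, master_key: bytes) -> int:
    # Multiplicative analogue of P x DET(k_master, v) x k.
    return pow(G, (det_scalar(master_key, value) * key) % Q, P)


def token(k1: int, k2: int) -> int:
    # Delta_{k1 -> k2} = k2 * k1^{-1}, computed modulo the group order.
    return (k2 * pow(k1, -1, Q)) % Q


def adjust(w: int, delta: int) -> int:
    # Multiplicative analogue of w x Delta.
    return pow(w, delta, P)
```

With keys drawn from 1 to Q - 1, `adjust(akh_hash(k1, v, m), token(k1, k2))` equals `akh_hash(k2, v, m)`, and `token(k2, k1)` is the modular inverse of `token(k1, k2)`, which is why either direction of a token reveals the same information.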

To encrypt a log in Eunomia^KH, we generate two keys $k_h, k_e$ for each column. These are called the hash key and the encryption key, respectively. Each cell $v$ in the column is transformed into a pair $(\mathsf{Hash}(k_h, v), \mathsf{Encrypt}(k_e, v))$. Here, $\mathsf{Hash}(k_h, v)$ is the AKH hash of $v$ with key $k_h$ and $\mathsf{Encrypt}(k_e, v)$ is a standard probabilistic encryption of $v$ with key $k_e$.¹ Columns do not share any keys. If audit on a policy requires testing columns $t_1.a_1$ and $t_2.a_2$ for equality and these columns have hash keys $k_{h_1}$ and $k_{h_2}$, then the audit algorithm is given one of the tokens $\Delta_{k_{h_1} \to k_{h_2}}$ and $\Delta_{k_{h_2} \to k_{h_1}}$. The algorithm can then transform hashes to test for equality. Each execution of the audit process can be given a different set of tokens depending on the policy being audited and, hence, unlike Eunomia^DET, the same encrypted log supports audit over any policy. However, equality testing is now more expensive because it invokes the Adjust() function. This increases the runtime overhead of audit.

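Formally, Eunomia^KH consists of the following four algorithms: KeyGen^KH($1^\kappa$, $\mathcal{S}$), EncryptLog^KH($\mathcal{L}$, $\mathcal{S}$, $\mathcal{K}$), EncryptPolicyConstants^KH($\varphi$, $\mathcal{K}$), and GenerateToken($\mathcal{S}$, $\delta$, $\mathcal{K}$).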

Key generation. The probabilistic algorithm KeyGen^KH(·, ·) takes as input a security parameter $\kappa$ and a log schema $\mathcal{S}$ and returns a key set $\mathcal{K}$. $\mathcal{K}$ contains tuples of the form $(p, a, k_h, k_e)$, meaning that column $p.a$ has hash key $k_h$ and encryption key $k_e$. The only constraint is that if $p.a$ contains timestamps, then $k_h = K_{\mathrm{time}}$.

¹ The $\mathsf{Encrypt}(k_e, v)$ component of the ciphertext is returned to the client Cl as part of the audit output. Cl then decrypts it to obtain concrete policy violations.

Encrypting the log. The algorithm EncryptLog^KH(·, ·, ·) takes as arguments a plaintext log $\mathcal{L}$, its schema $\mathcal{S}$ and a key set $\mathcal{K}$. It returns a pair $e\mathcal{L} = (eDB, eT)$, where $eDB$ is the cell-wise encryption of $\mathcal{L}$ with the appropriate keys from $\mathcal{K}$ and $eT$ is the mOPED encoding. Because each cell maps to a pair, each table has twice as many columns in $eDB$ as in $\mathcal{L}$.

Encrypting policy constants. To audit over Eunomia^KH encrypted logs, constants in the policy must be encrypted. The algorithm EncryptPolicyConstants^KH(·, ·) takes as input a plaintext policy $\varphi$ and a key set $\mathcal{K}$, and returns a policy $\varphi'$ in which constants have been encrypted with the appropriate keys taken from $\mathcal{K}$: if constant $c$ appears in the $i$th position of predicate $p$ in $\varphi$ and the hash and encryption keys of the $i$th column of $p$ in $\mathcal{K}$ are $k_h$ and $k_e$, respectively, then the constant $c$ is replaced by $(\mathsf{Hash}(k_h, c), \mathsf{Encrypt}(k_e, c))$ in $\varphi'$.

Generating tokens. GenerateToken(·, ·, ·) is used to generate tokens that are given to the audit algorithm to enable it to test for equality on the encrypted log. For each tuple $(p.a_1, q.a_2)$ in $\delta$, the algorithm GenerateToken($\mathcal{S}$, $\delta$, $\mathcal{K}$) returns the tuple $(p.a_1, q.a_2, \Delta_{k_1 \to k_2})$, where $(p, a_1, k_1, \_) \in \mathcal{K}$ and $(q, a_2, k_2, \_) \in \mathcal{K}$.
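Token generation then amounts to one modular division per pair in the equality scheme. The sketch below is illustrative only: it assumes hash keys are integers modulo a toy prime group order `Q` (a real instantiation would use the order of the elliptic-curve group), and the name `generate_tokens` and the dictionary layout of the key set are assumptions.

```python
Q = 509  # toy prime group order (assumption; insecure, illustration only)


def generate_tokens(delta, key_set):
    """Hypothetical sketch of GenerateToken.

    delta:   iterable of directed pairs (source_column, target_column)
    key_set: dict mapping a column to its (hash_key, encryption_key) pair
    Returns a dict mapping each directed pair to Delta_{k1 -> k2}.
    """
    tokens = {}
    for source, target in delta:
        k1, _ = key_set[source]
        k2, _ = key_set[target]
        # Delta_{k1 -> k2} = k2 * k1^{-1} modulo the group order.
        tokens[(source, target)] = (k2 * pow(k1, -1, Q)) % Q
    return tokens
```

Only the direction listed in the equality scheme is materialised; the reverse token is not emitted even though it could be derived by inverting the token modulo `Q`.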

Remark. From the perspective of confidentiality, the same amount of information is revealed irrespective of whether the audit algorithm (which may be compromised by the adversary) is given $\Delta_{k_{h_1} \to k_{h_2}}$ or $\Delta_{k_{h_2} \to k_{h_1}}$, because each token can be computed from the other. However, the actual token used for comparison by the audit algorithm can have a significant impact on its performance. Consider Policy 1 from Section [T], which stipulates that each name appearing in table $T_1$ appear in $T_2$ with the role Doctor. The audit process will iterate over the names in $T_1$ and look up those names in $T_2$. Consequently, for performance, it makes sense to index the hashes of the names in $T_2$ and for the audit algorithm to use the token $\Delta_{k_1 \to k_2}$, where $k_1$ and $k_2$ are the hash keys of names in $T_1$ and $T_2$, respectively. If, instead, the algorithm uses $\Delta_{k_2 \to k_1}$, then indexing is ineffective and performance suffers. The bottom line is that the directionality of information flow during equality testing matters for Eunomia^KH. Our policy analysis, which determines the columns that may be tested for equality during audit (Section 7), takes this directionality into account. The equality scheme $\delta$ returned by this analysis is directional (even though the use of $\delta$ in Eunomia^DET ignored this directionality): if $(p_1.a_1, p_2.a_2) \in \delta$, and $p_1.a_1$ and $p_2.a_2$ have hash keys $k_1$ and $k_2$, then the audit algorithm uses the token $\Delta_{k_1 \to k_2}$, not $\Delta_{k_2 \to k_1}$.

4.4 Mutable Order Preserving Encoding with 
Displacements (mOPED) 

We now discuss the mOPED scheme, which produces a data structure, $eT$, that allows computation of the boolean value $t_1 + d_1 < t_2 + d_2$ on the cloud, given only $\mathsf{Enc}(t_1)$, $\mathsf{Enc}(t_2)$, $\mathsf{Enc}(d_1)$ and $\mathsf{Enc}(d_2)$. Here, $\mathsf{Enc}(t)$ denotes the deterministic encryption of $t$ (in the case of Eunomia^DET) or the AKH hash of $t$ (in the case of Eunomia^KH) with the fixed key $K_{\mathrm{time}}$. The scheme mOPED extends a prior scheme, mOPE [32], which is the special case $d_1 = d_2 = 0$ of our scheme.

Consider first the simple case where the log $\mathcal{L}$ and the policy $\varphi$ are fixed. This means that the set $T$ of values of the form $t + d$ that the audit process may compare to each other is also fixed and finite (because $t$ is a timestamp in the finite log $\mathcal{L}$ and $d \in D$ is a displacement occurring in the finite policy $\varphi$). Suppose that the set $T$ has size $N$ (note that $N \in O(|D| \cdot |\mathcal{L}|)$). Then, the client can store on the cloud a map $eT : \mathsf{EncTimeStamp} \times \mathsf{EncD} \to \{1, \ldots, N\}$, which maps each encrypted timestamp $\mathsf{Enc}(t)$ and each encrypted displacement $\mathsf{Enc}(d)$ to the relative order of $t + d$ among the elements of $T$. To compute $t_1 + d_1 < t_2 + d_2$, the audit process can instead compute $eT(\mathsf{Enc}(t_1), \mathsf{Enc}(d_1)) < eT(\mathsf{Enc}(t_2), \mathsf{Enc}(d_2))$. The map $eT$ can be represented in many different ways. In our implementation, we use nested hash tables, where the outer table maps $\mathsf{Enc}(t)$ to an inner hash table and the inner table maps $\mathsf{Enc}(d)$ to the relative order of $t + d$. For audit applications where the log and policy are fixed upfront, this simple data structure $eT$ suffices.
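For the static case, the nested-hash-table representation of eT can be sketched in a few lines. The sketch is an illustration under stated assumptions: timestamps and displacements are integers, HMAC-SHA256 stands in for Enc under the fixed timestamp key, and the names `make_enc`, `build_et` and `less_than` are hypothetical.

```python
import hashlib
import hmac


def make_enc(key: bytes):
    # Stand-in for Enc(.) with a fixed timestamp key (assumption).
    def enc(value: int) -> str:
        return hmac.new(key, str(value).encode("ascii"), hashlib.sha256).hexdigest()
    return enc


def build_et(timestamps, displacements, enc):
    """Client side: build eT as nested dicts Enc(t) -> Enc(d) -> rank of t + d."""
    sums = sorted({t + d for t in timestamps for d in displacements})
    rank = {value: position + 1 for position, value in enumerate(sums)}
    et = {}
    for t in timestamps:
        inner = et.setdefault(enc(t), {})
        for d in displacements:
            inner[enc(d)] = rank[t + d]
    return et


def less_than(et, enc_t1, enc_d1, enc_t2, enc_d2):
    """Cloud side: decide t1 + d1 < t2 + d2 from ciphertexts and eT alone."""
    return et[enc_t1][enc_d1] < et[enc_t2][enc_d2]
```

Equal sums receive equal ranks, so the comparison is strict exactly when the plaintext comparison is. Including the displacement 0 in the displacement set lets the same structure compare raw timestamps.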

The scheme mOPED is more general and allows the client to incrementally update $eT$ on the cloud. This is relevant when either the policy or the log changes often. A single addition or deletion of $t$ or $d$ can cause the map $eT$ to change for potentially all other elements and, hence, a naive implementation of $eT$ may incur cost linear in the current size of $T$ for a single update. Popa et al. show how this cost can be made logarithmic by interactively maintaining a balanced binary search tree over encrypted values $\mathsf{Enc}(t)$ and using paths in this search tree as the co-domain of $eT$. We extend this approach by maintaining a binary search tree over pairs $(\mathsf{Enc}(t), \mathsf{Enc}(d))$, where the search order reflects the natural order over $t + d$. Since the cloud never sees plaintext data, the update of this binary search tree and the map $eT$ must be interactive with the client. We omit the details of this interactive update and refer the reader to [32]. As the cloud may be compromised, the security property we prove of mOPED (Section 5.1) holds despite the adversary observing every interaction with the client. We note that an audit algorithm never updates $eT$, so its execution remains non-interactive.

5. SECURITY ANALYSIS 

We now prove that our schemes Eunomia^DET, Eunomia^KH and mOPED are secure. We start with mOPED, because Eunomia^DET and Eunomia^KH rely on it.

5.1 Security of mOPED 

We formalize the security of mOPED as an indistinguishability game in which the adversary provides two sequences of timestamps and a set of displacements $D$, then observes the client and server construct the mOPED data structure $eT$ on one of these sequences chosen at random, and then tries to guess which sequence it is. We call this game IND-CDDA (indistinguishability under chosen distances with displacement attack). This definition is directly based on the IND-OCPA (indistinguishability under ordered chosen plaintext attack) definition by Boldyreva et al. and the LoR security definition by Pandey and Rouselakis. Because $eT$ intentionally reveals the relative order of all timestamps after displacement with constants in $D$, we need to impose a constraint on the two sequences chosen by the adversary.


Let $u[i]$ denote the $i$th element of the sequence $u$. We say that two sequences of timestamps $u$ and $v$ are equal up to distances with displacements $D$, written $\mathsf{EDD}(u, v, D)$, iff $|u| = |v|$ and $\forall d, d' \in D,\ \forall i, j.\ (u[i] + d > u[j] + d') \Leftrightarrow (v[i] + d > v[j] + d')$. We describe here the IND-CDDA game and the security proof for mOPED with deterministic encryption; the case of mOPED with AKH hashes is similar.
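The EDD predicate can be checked directly from its definition. The sketch below is a brute-force illustration that assumes integer timestamps and displacements; the function name `edd` is hypothetical.

```python
from itertools import product


def edd(u, v, displacements):
    """Return True iff u and v are equal up to distances with displacements."""
    if len(u) != len(v):
        return False
    indices = range(len(u))
    # Every displaced comparison must have the same outcome in u and in v.
    return all(
        (u[i] + d > u[j] + e) == (v[i] + d > v[j] + e)
        for i, j in product(indices, repeat=2)
        for d, e in product(displacements, repeat=2)
    )
```

For instance, shifting every timestamp by the same amount preserves EDD, whereas shrinking a gap so that a displacement now crosses it does not: `edd([1, 5], [1, 2], [0, 2])` is false because 1 + 2 > 5 fails while 1 + 2 > 2 holds.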

IND-CDDA game. The IND-CDDA security game between a client or challenger Cl (i.e., the owner of the audit log) and an adaptive, probabilistic polynomial time (ppt) adversary Adv for the security parameter $\kappa$ proceeds as follows:

1. Cl generates a secret key $K_{\mathrm{time}}$ using the probabilistic key generation algorithm KeyGen: $K_{\mathrm{time}} \leftarrow \mathsf{KeyGen}(1^\kappa)$.

2. Cl chooses a random bit $b$: $b \leftarrow \{0, 1\}$.

3. Cl creates an empty $eT$ on the cloud.

4. Adv chooses a set of distances $D = \{d_1, \ldots, d_n\}$ and sends it to Cl.

5. Cl and Adv engage in a number of rounds of interaction that is polynomial in $\kappa$. In each round $j$:

(a) Adv selects two values $v_j^0$ and $v_j^1$ and sends them to Cl.

(b) Cl deterministically encrypts the following $n + 1$ values using $K_{\mathrm{time}}$: $v_j^b, v_j^b + d_1, v_j^b + d_2, \ldots, v_j^b + d_n$.

(c) Cl interacts with the cloud to insert $\mathsf{DET}(K_{\mathrm{time}}, v_j^b)$ and $\{\mathsf{DET}(K_{\mathrm{time}}, v_j^b + d_i)\}_{i=1}^{n}$ into $eT$. The adversary observes this interaction and the cloud's complete state, but not Cl's local computation.

6. Adv outputs its guess $b'$ of $b$.

Adv wins the IND-CDDA security game iff:

1. Adv guesses $b$ correctly (i.e., $b = b'$);

2. $\mathsf{EDD}([v_1^0, \ldots, v_m^0], [v_1^1, \ldots, v_m^1], D)$ holds, where $m$ is the number of rounds played in the game.

Let $\mathsf{win}^{\mathsf{Adv},\kappa}$ be a random variable which is 1 if Adv wins and 0 if Adv loses. Recall that a function $f : \mathbb{N} \to \mathbb{R}$ is negligible with respect to its argument $\kappa$ if for every $c \in \mathbb{N}$ there exists another integer $K$ such that for all $\kappa > K$, $f(\kappa) < \kappa^{-c}$. We write $\mathsf{negl}(\kappa)$ to denote some negligible function of $\kappa$.

Theorem 1 (Security of mOPED with deterministic encryption) Assuming that deterministic encryption is a pseudorandom function, our mOPED scheme is IND-CDDA secure, i.e., $\Pr[\mathsf{win}^{\mathsf{Adv},\kappa} = 1] \le \frac{1}{2} + \mathsf{negl}(\kappa)$, where the probability is taken over the random coins used by Adv as well as the random coins used in choosing the key and the bit $b$.

Proof. By a hybrid argument. We augment a similar 
proof of security for the mOPE scheme [32] to also take 
displacements D into account. □ 

Security of mOPED with AKH hash. The security game for mOPED with AKH hashes is very similar to IND-CDDA. We replace $\mathsf{DET}(\cdot, \cdot)$ with $\mathsf{Hash}(\cdot, \cdot)$ in the game. The proof is in the standard model and reduces to the security of AKH [34, Definition 4] and finally to the elliptic-curve decisional Diffie-Hellman (ECDDH) assumption.


5.2 Security of Eunomia^DET

We prove security for Eunomia^DET, formalized as an indistinguishability game. We first define a notion of log equivalence that characterizes the confidentiality achieved by Eunomia^DET (and, as we explain later, by Eunomia^KH). This notion is a central contribution of our work. The security theorem in this section shows that by looking at the Eunomia^DET encryption of a log, a ppt adversary can learn only that the log belongs to its equivalence class (with non-negligible probability). Hence, the equivalence class of the log represents the uncertainty of the adversary about the log's contents and, therefore, characterizes what confidentiality the scheme provides.

Definition 1 (Plaintext log equivalence) Given two plaintext audit logs $\mathcal{L}_1$ and $\mathcal{L}_2$, an equality scheme $\delta$, a set of constants $C$ and a set of displacements $D \subseteq C$, $\mathcal{L}_1$ and $\mathcal{L}_2$ are equivalent, denoted by $\mathcal{L}_1 \equiv_{(\delta, C, D)} \mathcal{L}_2$, if and only if all of the following hold:

1. $\mathcal{L}_1$ and $\mathcal{L}_2$ have the same schema, and tables of the same name in $\mathcal{L}_1$ and $\mathcal{L}_2$ have the same number of records (rows).

2. For each equivalence class of columns defined by $\delta$, there is a bijection from values of $\mathcal{L}_1$ to values of $\mathcal{L}_2$. (By equivalence class of columns defined by $\delta$, we mean an equivalence class of columns defined by the reflexive, symmetric, transitive closure of $\delta$.) For a table $t$ and a column $a$, let $\mathcal{M}_{t,a}$ denote the bijection corresponding to the equivalence class of $\delta$ in which $(t, a)$ lies. Let $v$ be the value in some row $i$ of table $t$, column $a$ in $\mathcal{L}_1$. Then,

(a) The value in the $i$th row of table $t$, column $a$ in $\mathcal{L}_2$ is $\mathcal{M}_{t,a}(v)$.

(b) If $v \in C$, then $\mathcal{M}_{t,a}(v) = v$.

(c) $|v| = |\mathcal{M}_{t,a}(v)|$.

3. Let $\mathsf{timeStamps}(\mathcal{L}_1)$ be the sequence of timestamps in $\mathcal{L}_1$ obtained by traversing the tables of $\mathcal{L}_1$ in any order and the timestamps within each table in row order. Let $\mathsf{timeStamps}(\mathcal{L}_2)$ be the timestamps in $\mathcal{L}_2$ obtained similarly, traversing tables in the same order. Then, $\mathsf{EDD}(\mathsf{timeStamps}(\mathcal{L}_1), \mathsf{timeStamps}(\mathcal{L}_2), D)$ holds.

We now intuitively explain the requirements for two plaintext logs $\mathcal{L}_1$ and $\mathcal{L}_2$ to be equivalent according to our definition. The first requirement is that $\mathcal{L}_1$ and $\mathcal{L}_2$ have the same schema, the same table names, and the same number of records. The second requirement states that for each equivalence class of columns defined by the reflexive, symmetric, transitive closure of $\delta$, there exists a bijection $\mathcal{M}$ from the plaintext values appearing in those columns in $\mathcal{L}_1$ to the plaintext values appearing in those columns in $\mathcal{L}_2$ such that: (a) all constant values $c$ appearing in those columns (i.e., $c \in C$) are mapped to themselves; (b) if $\mathcal{M}$ maps a value $p_1$ to $p_2$, then $p_1$ and $p_2$ are of the same length; (c) if we take an arbitrary row $i$ in an arbitrary column $j$ (where $j$ belongs to the table $T$ and the equivalence class in question) in $\mathcal{L}_1$ and assume $\mathcal{L}_1.T[i][j] = v_1$, then applying $\mathcal{M}$ to $v_1$ yields the value of $\mathcal{L}_2.T[i][j]$, and vice versa. The final requirement demands that if we take any two displacements $d_1, d_2 \in D$ and any two timestamps $t_i^1, t_j^1$ from $\mathcal{L}_1$, and the values of the corresponding cells in $\mathcal{L}_2$ are $t_i^2, t_j^2$, then the following holds: $t_i^1 + d_1 > t_j^1 + d_2 \Leftrightarrow t_i^2 + d_1 > t_j^2 + d_2$.
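The first two requirements of the definition can be turned into a direct check. The sketch below is illustrative and makes several assumptions: a log is a dictionary from table names to lists of rows, a row is a dictionary from column names to string values, the caller supplies a function `class_of` that names the equivalence class of a column under the closure of the equality scheme, and constants are required to be fixed in both directions. The third requirement (EDD over timestamps) is not covered here.

```python
def equivalent_cells(log1, log2, class_of, constants):
    """Check requirements 1 and 2 of plaintext log equivalence (sketch)."""
    if log1.keys() != log2.keys():
        return False
    forward, backward = {}, {}  # per-class maps L1 -> L2 and L2 -> L1
    for table in log1:
        rows1, rows2 = log1[table], log2[table]
        if len(rows1) != len(rows2):
            return False
        for row1, row2 in zip(rows1, rows2):
            if row1.keys() != row2.keys():
                return False
            for column, v1 in row1.items():
                v2 = row2[column]
                cls = class_of(table, column)
                # 2(a): one consistent, injective mapping per class.
                if forward.setdefault((cls, v1), v2) != v2:
                    return False
                if backward.setdefault((cls, v2), v1) != v1:
                    return False
                # 2(b): constants are fixed points (checked both ways).
                if (v1 in constants or v2 in constants) and v1 != v2:
                    return False
                # 2(c): mapped values have equal length.
                if len(v1) != len(v2):
                    return False
    return True
```

Two logs that differ only by a consistent renaming of equal-length, non-constant values pass the check, while a log that merges two distinct names into one does not.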


We now define the IND-CPLA^DET game (Chosen Plaintext Log Attack), which defines what it means for two logs to be indistinguishable to an adversary Adv under Eunomia^DET.

IND-CPLA^DET game. The IND-CPLA^DET game is played between a client or challenger Cl and an adversary Adv for all large enough security parameters $\kappa$.

1. Adv picks a log schema $\mathcal{S}$, the sets $C$, $D$ and an equality scheme $\delta$ and gives these to Cl.

2. Cl probabilistically generates a set of secret keys $\mathcal{K}$ based on the sufficiently large security parameter $\kappa$, the log schema $\mathcal{S}$, and the equality scheme $\delta$: $\mathcal{K} \leftarrow \mathsf{KeyGen}^{\mathsf{DET}}(1^\kappa, \mathcal{S}, \delta)$.

3. Cl randomly selects a bit $b$: $b \leftarrow \{0, 1\}$.

4. Adv chooses two plaintext audit logs $\mathcal{L}_0$ and $\mathcal{L}_1$ such that both $\mathcal{L}_0$ and $\mathcal{L}_1$ have schema $\mathcal{S}$, $\mathcal{L}_0 \equiv_{(\delta, C, D)} \mathcal{L}_1$, and $\mathcal{L}_0 \neq \mathcal{L}_1$, and sends $\mathcal{L}_0$, $\mathcal{L}_1$ to Cl.

5. Following the scheme Eunomia^DET, Cl deterministically encrypts $\mathcal{L}_b$ according to the key set $\mathcal{K}$ to obtain the encrypted audit log $eDB_b$. It then constructs the mOPED data structure $eT_b$. Adv may observe the construction of $eT_b$ passively. Cl then sends $(eDB_b, eT_b)$ to Adv. ($(eDB_b, eT_b) \leftarrow \mathsf{EncryptLog}^{\mathsf{DET}}(\mathcal{L}_b, \mathcal{S}, \mathcal{K})$.)

6. For any constant $c \in C$, if $c$ appears in table $t$, column $a$ of $\mathcal{L}_b$, then Cl gives Adv the encryption of $c$ with the encryption key of column $a$.

7. Adv runs a probabilistic algorithm that may invoke the encryption oracle on keys from $\mathcal{K}$ but never asks for the encryption of any value in $\mathcal{L}_0$ or $\mathcal{L}_1$.

8. Adv outputs its guess $b'$ of $b$.

Adv wins the IND-CPLA^DET game iff $b = b'$. Let the random variable $\mathsf{win}^{\mathsf{Adv}}_{\mathsf{DET}}$ be 1 if Adv wins and 0 otherwise.

Theorem 2 (Security of Eunomia^DET) If deterministic encryption is a pseudorandom function, Eunomia^DET is IND-CPLA^DET secure, i.e., for any ppt adversary Adv and sufficiently large $\kappa$, $\Pr[\mathsf{win}^{\mathsf{Adv}}_{\mathsf{DET}} = 1] \le \frac{1}{2} + \mathsf{negl}(\kappa)$, where the probability is taken over the random coins used by Adv as well as the random coins used in choosing keys and the random bit $b$.

Proof. By hybrid argument. We successively replace 
uses of deterministic encryption with a random oracle. If 
the Adv can distinguish two consecutive hybrids with non- 
negligible probability, it can also distinguish a random oracle 
from a pseudorandom function, which is a contradiction. □ 

Intuitively, this theorem says that no ppt adversary can distinguish two equivalent logs encrypted with Eunomia^DET, except with negligible probability.

5.3 Security of Eunomia^KH

We now define and prove security for the log encryption scheme Eunomia^KH. The security game, IND-CPLA^KH, is similar to that for Eunomia^DET and uses the same notion of log equivalence. The proof of security for Eunomia^KH reduces to the ECDDH assumption.

IND-CPLA^KH game. The IND-CPLA^KH game is played between a challenger Cl and an adversary Adv for all large enough security parameters $\kappa$. It is very similar to the IND-CPLA^DET security game, with the following differences. All encryption is done using the Eunomia^KH approach. Additionally, there is one more step after step 5, in which Cl generates the token list $\Delta$ according to $\delta$ and sends it to Adv. We do not show the full IND-CPLA^KH game due to space constraints. Adv wins the IND-CPLA^KH game iff $b = b'$. Let the random variable $\mathsf{win}^{\mathsf{Adv}}_{\mathsf{KH}}$ be 1 if Adv wins and 0 otherwise.

Theorem 3 (Security of Eunomia^KH) If the ECDDH assumption holds and the encryption scheme used in Eunomia^KH is IND-CPA secure, then Eunomia^KH is IND-CPLA^KH secure, i.e., for any ppt adversary Adv and sufficiently large $\kappa$, the following holds: $\Pr[\mathsf{win}^{\mathsf{Adv}}_{\mathsf{KH}} = 1] \le \frac{1}{2} + \mathsf{negl}(\kappa)$, where the probability is taken over the random coins used by Adv as well as the random coins used in choosing keys and the random bit $b$.

Proof. By a hybrid argument, we reduce to the IND-CPA security of encryption and the security of AKH [34, Definition 4]. The latter relies on the ECDDH assumption. □

Generalizing security definitions. The IND-CPLA^DET and IND-CPLA^KH security definitions are presented as games in which the adversary and the challenger interact for a single round. However, it is possible to lift our security definitions to a polynomial number of rounds of interaction between the adversary and the challenger. To achieve this, we require that for two consecutive rounds $i$ and $i + 1$, where $\mathcal{L}_0^{i}, \mathcal{L}_1^{i}$ are the adversary-chosen equivalent plaintext logs at step $i$ and $\mathcal{L}_0^{i+1}, \mathcal{L}_1^{i+1}$ are the adversary-chosen equivalent plaintext logs at step $i + 1$, we have $\mathcal{L}_0^{i+1} \ge \mathcal{L}_0^{i}$ and $\mathcal{L}_1^{i+1} \ge \mathcal{L}_1^{i}$.

Information leakage. Our security definitions state that the only information leaked by our schemes is which log equivalence class a particular plaintext audit log belongs to. Note that a log's membership in an equivalence class is a symbolic representation of all information that Eunomia may leak. More precisely, by analyzing the log equivalence definition (Definition 1), it is possible to infer all information that Eunomia may leak. For instance, according to Definition 1, two equivalent plaintext logs have the same frequency distribution in every column of the log, whereas two logs in different equivalence classes may have different frequency distributions in some column. Hence, Eunomia leaks the frequency distribution of every column.
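The frequency leak is easy to observe on a single column. In the following sketch, HMAC-SHA256 stands in for deterministic encryption, and the column contents, the key, and the helper names `det` and `frequency_profile` are made up for illustration.

```python
import hashlib
import hmac
from collections import Counter


def det(key: bytes, value: str) -> str:
    # Stand-in for deterministic encryption (keyed, one-way).
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()


def frequency_profile(values):
    # Multiset of occurrence counts, independent of the actual values.
    return sorted(Counter(values).values(), reverse=True)


column = ["doctor", "nurse", "doctor", "doctor", "clerk"]
cipher = [det(b"hypothetical-column-key", v) for v in column]
```

Because equal plaintexts map to equal ciphertexts, `frequency_profile(cipher)` matches `frequency_profile(column)`: an observer learns that one value occurs three times and two values occur once, without learning which values they are.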

6. AUDITING ALGORITHM 

We now present our auditing algorithm ereduce, which is an enhancement of its plaintext counterpart reduce. We choose to enhance reduce because it supports a rich set of policies, including HIPAA and GLBA. Further, reduce has support for incompleteness in the audit log. We write ereduce^DET to denote the instance of the audit algorithm for logs encrypted under Eunomia^DET and ereduce^KH for the instance for Eunomia^KH-encrypted audit logs. The main difference between reduce and ereduce^KH/DET is that ereduce^KH/DET requires the special mOPED data structure to evaluate the timeOrder predicate, whereas reduce directly checks the linear integer constraint. We describe ereduce^KH in detail and point out the differences between the two instances.

6.1 Auxiliary Definitions 

A substitution $\sigma$ is a finite map that maps variables to value, provenance pairs. Each element of the range of a substitution is of the form $(v, \ell)$, in which $v$ refers to the value that the variable is mapped to and $\ell$ refers to the provenance of $v$. The provenance $\ell$ refers to the source of the value and is of the form $p.a$, where $p$ represents a table name and $a$ represents a column name. We commonly write a substitution $\sigma$ as a finite list of elements, each element having the form $(x, v_h, v_e, \ell)$. For any variable $x$ in $\sigma$'s domain, we use $\sigma(x).\mathsf{hash}$, $\sigma(x).\mathsf{cipher}$, and $\sigma(x).\ell$ to select the hash value (i.e., $v_h$), the ciphertext value (i.e., $v_e$), and the provenance (i.e., $\ell$), respectively.

We say substitution $\sigma_1$ extends $\sigma_2$ (denoted $\sigma_1 \ge \sigma_2$) if $\sigma_1$ agrees with all variable mappings in $\sigma_2$'s domain. Given a substitution $\sigma$, we define $[\![\sigma]\!] = \{(x, tb.cl) \mid \exists v.\ \sigma(x) = (v, tb.cl)\}$. We use $\sigma \downarrow X$, where $X \subseteq \mathsf{domain}(\sigma)$, to denote the substitution $\sigma'$ such that $\sigma \ge \sigma'$ and the domain of $\sigma'$ only contains variables from the sequence $X$. We can lift the $\downarrow$ operation to a set of substitutions $\Sigma$. We use $\bullet$ to denote the identity substitution. We say a substitution $\sigma$ satisfies a formula $g$ on a log $e\mathcal{L}$ if replacing each free variable $x$ in $g$ with the concrete value $\sigma(x).\mathsf{hash}$ results in a formula that is true on $e\mathcal{L}$.
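A substitution with provenance can be modelled as a small dictionary-based structure. The sketch below is one possible encoding, not the paper's: the class `Binding` and the helpers `extends`, `restrict` and `provenance_view` are hypothetical names for the extension relation, the restriction operation, and the provenance projection described above.

```python
from typing import Dict, Iterable, NamedTuple, Set, Tuple


class Binding(NamedTuple):
    hash_value: str   # sigma(x).hash
    cipher: str       # sigma(x).cipher
    provenance: str   # sigma(x).l, written "table.column"


Substitution = Dict[str, Binding]


def extends(sigma1: Substitution, sigma2: Substitution) -> bool:
    # sigma1 >= sigma2: sigma1 agrees with every mapping in sigma2's domain.
    return all(var in sigma1 and sigma1[var] == b for var, b in sigma2.items())


def restrict(sigma: Substitution, variables: Iterable[str]) -> Substitution:
    # Restriction of sigma to the given variables (a subset of its domain).
    return {var: sigma[var] for var in variables}


def provenance_view(sigma: Substitution) -> Set[Tuple[str, str]]:
    # The set of (variable, table.column) pairs recorded in sigma.
    return {(var, b.provenance) for var, b in sigma.items()}
```

In this encoding the identity substitution is the empty dictionary, which every substitution extends.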

6.2 Algorithm 

The key functions of ereduce are summarized below.

ereduce^KH($e\mathcal{L}$, $\varphi$, $\Delta$, $\sigma$) is the top-level function that takes as input a Eunomia^KH-encrypted audit log $e\mathcal{L}$, a constant encrypted policy $\varphi$, a set of tokens $\Delta$, and an input substitution $\sigma$, and returns a residual policy $\psi$. $\psi$ represents a formula containing the part of the original policy $\varphi$ that cannot be evaluated due to incomplete information in $e\mathcal{L}$. We use $\bullet$ as the input substitution to the initial call to ereduce^KH. ereduce^KH (resp., esat and esat^KH) evaluates the input formula from left to right, respecting operator precedence.

esat($e\mathcal{L}$, $g$, $\Delta$, $\sigma$) is an auxiliary function used by ereduce^KH while evaluating quantifiers to get all finite substitutions that satisfy a formula. It takes as input a Eunomia^KH-encrypted audit log $e\mathcal{L}$, a constant encrypted formula $g$, a set of tokens $\Delta$, and an input substitution $\sigma$, and returns all finite substitutions for the free variables of $g$ that extend $\sigma$ and satisfy $g$ with respect to $e\mathcal{L}$.

esat^KH($e\mathcal{L}$, $p(t)$, $\Delta$, $\sigma$) is an auxiliary function used by esat for computing all finite substitutions for a given predicate with respect to an input substitution. The inputs $e\mathcal{L}$, $\Delta$, and $\sigma$ have their usual meaning. $p(t)$ is a constant encrypted predicate. This function returns all finite substitutions for the free variables of $p(t)$ that extend the input substitution $\sigma$ and satisfy $p(t)$ with respect to $e\mathcal{L}$. The implementation of esat^KH depends on the representation of the audit log. In our case, to evaluate the timeOrder predicate, esat^KH consults the mOPED data structure, whereas the evaluation of other predicates queries the database representing the audit log.

For ease of exposition, we drop the superscript KH from ereduce^KH, esat, and esat^KH in the discussion below. ereduce eagerly evaluates as much of the input policy $\varphi$ as it can; in case it cannot evaluate portions of $\varphi$ due to incompleteness in $e\mathcal{L}$, it returns that portion of $\varphi$ as part of the result. The return value of ereduce is thus a formula $\psi$ in our logic (i.e., the residual formula). Auditing with ereduce is an iterative process. When the current log $e\mathcal{L}$ is extended with additional information (i.e., removing some incompleteness), resulting in the new log $e\mathcal{L}_1$ (viz., $e\mathcal{L}_1 \ge e\mathcal{L}$), one can call ereduce again with the residual formula $\psi$ as the input policy and $e\mathcal{L}_1$ as the input log.

We present selected cases of ereduce in Figure 2. The conditions above the bar are premises and the condition below the bar is the conclusion. We use the notation $f(a) \Downarrow \psi$ to mean that function $f$ returns $\psi$ when applied to arguments $a$.

Figure 2: ereduce description


When the input formula to ereduce is a predicate $p(t)$ (R-P case), ereduce uses $\sigma$ and $\Delta$ to replace all variables in $p(t)$ with concrete values (with proper hash adjustments) to obtain a new ground predicate $p(t')$. (A ground predicate only has constants as arguments.) Then it consults $e\mathcal{L}$ to check whether $p(t')$ exists. If $e\mathcal{L}(p(t')) = \mathsf{uu}$, indicating that the log does not have enough information, then ereduce returns $p(t')$. Otherwise, it returns either true or false depending on whether there is a row in table $p$ with hash values matching $t'$. For example, let us assume that ereduce is called with the input substitution $\sigma = [(p_1, v_{k_1}, *, tb.cl), \ldots]$ and the input predicate activeRole($p_1$, (doctor$_{k_3}$, $*$)), where $p_1$ is a variable and doctor is a constant. Here $*$ represents a ciphertext and is not important for this example. Let us assume that column 1 of the activeRole table uses the keys $(k_2, *)$, whereas in $\sigma$ the hash value mapped to $p_1$ is generated using $k_1$. Hence, we have to change the value $v_{k_1}$ to $v_{k_2}$ using the token $\Delta_{k_1 \to k_2} \in \Delta$. Then, using the following SQL query, we check whether a row with the appropriate hash values exists: "select * from activeRole where column1Hash = $v_{k_2}$ and column2Hash = doctor$_{k_3}$". If such a row exists, then $\top$ is returned; otherwise, $\bot$ is returned. When ereduce is called for timeOrder, the same hash adjustment applies before the mOPED data structure is consulted.
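The activeRole example can be replayed end to end on an in-memory SQLite table. This is a sketch under strong simplifying assumptions: the elliptic curve is replaced by a toy multiplicative group modulo 1019 (insecure), plaintexts are represented directly by the integer scalar that DET would produce, only the true/false outcome is modelled (not the unknown value uu), and the helper names are hypothetical. The table and column names follow the SQL query quoted above.

```python
import sqlite3

P, Q, G = 1019, 509, 4  # toy group: safe prime, subgroup order, generator


def akh_hash(key: int, scalar: int) -> int:
    # Toy AKH hash of a value already reduced to a scalar modulo Q.
    return pow(G, (scalar * key) % Q, P)


def adjust(w: int, delta: int) -> int:
    return pow(w, delta, P)


def ground_atom_holds(conn, v_k1: int, delta_k1_k2: int, doctor_k3: int) -> bool:
    """R-P sketch: adjust the bound hash to the column key, then query."""
    v_k2 = adjust(v_k1, delta_k1_k2)
    row = conn.execute(
        "SELECT 1 FROM activeRole WHERE column1Hash = ? AND column2Hash = ? LIMIT 1",
        (v_k2, doctor_k3),
    ).fetchone()
    return row is not None


# Hypothetical setup: hash keys k1 (source column), k2 and k3 (activeRole).
k1, k2, k3 = 11, 29, 53
alice, bob, doctor = 77, 78, 200  # scalars standing in for DET(k_master, v)
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE activeRole (column1Hash INTEGER, column2Hash INTEGER)")
conn.execute(
    "INSERT INTO activeRole VALUES (?, ?)",
    (akh_hash(k2, alice), akh_hash(k3, doctor)),
)
delta_k1_k2 = (k2 * pow(k1, -1, Q)) % Q  # Delta_{k1 -> k2}
```

Calling `ground_atom_holds(conn, akh_hash(k1, alice), delta_k1_k2, akh_hash(k3, doctor))` finds the row, while the same call with `bob` does not; the lookup itself needs no key material beyond the token.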

In rule R-∨, ereduce is recursively called for each of the sub-clauses in the disjunction, $\varphi_1$ (resp., $\varphi_2$), yielding $\psi_1$ (resp., $\psi_2$), and the returned residual formula is $\psi_1 \lor \psi_2$.

When the input formula is of the form $\forall x.(g \to \varphi)$ (R-∀ case), we first use the function esat (described below) to get all substitutions $\Sigma'$ for $x$ that extend $\sigma$ and satisfy $g$ with respect to $e\mathcal{L}$. As we require policies to pass the EQ mode check, it is ensured that there is a finite number of such substitutions for $x$. For each of these substitutions $\sigma_i \in \Sigma'$, we then recursively call ereduce for $\varphi$ to obtain the residual formula $\psi_i$. The returned residual formula is then $\bigwedge_i \psi_i \land \varphi'$, where $\varphi'$ ensures that the same substitutions $\sigma_i$ for $x$ are not checked again when $e\mathcal{L}$ is extended.

Next, we explain selected rules for esat (presented below) 
with an example. 


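(S-P)  Premise:     $\Sigma \leftarrow \mathsf{esat}^{\mathsf{KH}}(e\mathcal{L}, p(t), \Delta, \sigma)$
       Conclusion:  $\mathsf{esat}(e\mathcal{L}, p(t), \Delta, \sigma) \Downarrow \Sigma$

(S-∧)  Premises:    $\mathsf{esat}(e\mathcal{L}, g_1, \Delta, \sigma) \Downarrow \Sigma'$  and  $\forall \sigma_i \in \Sigma'.\ \mathsf{esat}(e\mathcal{L}, g_2, \Delta, \sigma_i) \Downarrow \Sigma_i$
       Conclusion:  $\mathsf{esat}(e\mathcal{L}, g_1 \land g_2, \Delta, \sigma) \Downarrow \bigcup_i \Sigma_i$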

Let us assume esat is called with the formula $g = p(x) \land q(x, y)$ and the substitution $\sigma = \emptyset$ (empty) as input. The S-∧ rule applies and esat is first called recursively on $p(x)$ and $\sigma = \emptyset$. Now, the rule S-P applies. Here, $x$ is not in the domain of $\sigma$, so the esat function consults $e\mathcal{L}$ (i.e., using an SQL query like "select * from p") to find concrete values of $x$ that make $p(x)$ true. Let us assume that we get $(v_{k_1}, *)$ (i.e., $k_1$ is used to hash column 1 of table $p$). Hence, esat returns the substitution $\sigma_1 = [(x, v_{k_1}, *, p.1)]$ as output. Going back to the S-∧ rule, the second premise of S-∧ now calls esat for $q(x, y)$ with each substitution obtained after evaluating $p(x)$, in our case $\sigma_1$. Let us assume that columns 1 and 2 of table $q$ are hashed with keys $k_2$ and $k_3$, respectively. While evaluating $q(x, y)$ with $\sigma_1$, the S-P rule is used. $\sigma_1$ already maps variable $x$, but with key $k_1$; thus esat converts $v_{k_1}$ to $v_{k_2}$ using the token $\Delta_{p.1 \to q.1}$. It then tries to get concrete values for $y$ (with respect to the given value of $x$) by consulting table $q$ in $e\mathcal{L}$ using the following SQL query: "select column2Hash, column2Cipher from q where column1Hash = $v_{k_2}$". Assuming the SQL query returns $(w_{k_3}, *)$ for column 2 (i.e., $y$), esat returns the substitution $[(x, v_{k_1}, *, p.1), (y, w_{k_3}, *, q.2)]$.
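The walk-through for $p(x) \land q(x, y)$ corresponds to an index-backed join with one hash adjustment per binding of $x$. The sketch below is illustrative only: it uses a toy multiplicative group in place of the elliptic curve, represents plaintexts by integer scalars, keeps tables as in-memory lists instead of SQL tables, omits the ciphertext components, and uses the hypothetical name `esat_conjunction`.

```python
P, Q, G = 1019, 509, 4  # toy group parameters (insecure, illustration only)


def akh_hash(key: int, scalar: int) -> int:
    return pow(G, (scalar * key) % Q, P)


def adjust(w: int, delta: int) -> int:
    return pow(w, delta, P)


def esat_conjunction(p_rows, q_rows, delta_p1_q1):
    """Satisfying substitutions for p(x) AND q(x, y) (sketch).

    p_rows: hashes of column p.1 (under key k1)
    q_rows: pairs (hash of q.1 under k2, hash of q.2 under k3)
    """
    # Index q on its first column, as the cloud database would.
    index = {}
    for x_k2, y_k3 in q_rows:
        index.setdefault(x_k2, []).append(y_k3)

    results = []
    for x_k1 in p_rows:                      # S-P on p(x): bind x
        x_k2 = adjust(x_k1, delta_p1_q1)     # convert to q.1's key
        for y_k3 in index.get(x_k2, []):     # S-P on q(x, y): bind y
            results.append({"x": (x_k1, "p.1"), "y": (y_k3, "q.2")})
    return results
```

The binding for $x$ keeps its original hash and provenance p.1; only the probe into the index uses the adjusted hash. Probing an index on q is possible precisely because the token runs from p.1 to q.1, which is the directionality point made for Eunomia^KH.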

Differences between ereduce under Eunomia^DET and Eunomia^KH. As we have shown, ereduce^KH tracks the provenance of the encrypted data values required for audit. This is not required by ereduce^DET. For ereduce^DET, the substitution $\sigma$ simply maps variables to deterministic ciphertexts. Further, in the R-P and S-P rules, no adjustment is needed.

6.3 Properties 

We have proved the correctness of both ereduce^DET and ereduce^KH. We show the theorem for ereduce^KH below. Theorem 4 states that the decrypted result of ereduce and the result of reduce are equal with high probability. The results may fail to be equal only because of hash collisions. Due to space constraints, we do not show the proof here. Here, the EncryptSubstitution^KH function encrypts a plaintext substitution with provenance to an encrypted one and is very similar to the EncryptPolicyConstants^KH function (see Section 4.3).

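Theorem 4 (Correctness of ereduce^KH) For all plaintext policies $\varphi_P$ and $\psi_P$, for all constant encrypted policies $\varphi_E$ and $\psi_E$, for all database schemas $\mathcal{S}$, for all plaintext audit logs $\mathcal{L} = (DB, T)$, for all encrypted audit logs $e\mathcal{L} = (eDB, eT)$, for all plaintext substitutions $\sigma_P$, for all encrypted substitutions $\sigma_E$, for all $\chi_I$, for all equality schemes $\delta$, for all security parameters $\kappa$, for all encryption keys $\mathcal{K}$, for all token lists $\Delta$, if all of the following hold: (1) $\chi_I \vdash \varphi_P : \delta$, (2) $[\![\sigma_P]\!] \supseteq \chi_I$, (3) $\mathcal{K} = \mathsf{KeyGen}^{\mathsf{KH}}(\kappa, \mathcal{S})$, $\Delta = \mathsf{GenerateToken}(\mathcal{S}, \delta, \mathcal{K})$, (4) $e\mathcal{L} = \mathsf{EncryptLog}^{\mathsf{KH}}(\mathcal{L}, \mathcal{S}, \mathcal{K})$, (5) $\varphi_E = \mathsf{EncryptPolicyConstants}^{\mathsf{KH}}(\varphi_P, \mathcal{K})$, (6) AKH key adjustment is correct, (7) $\sigma_E = \mathsf{EncryptSubstitution}^{\mathsf{KH}}(\sigma_P, \mathcal{K})$, (8) $\psi_P = \mathsf{reduce}(\mathcal{L}, \sigma_P, \varphi_P)$, (9) $\psi_E = \mathsf{ereduce}^{\mathsf{KH}}(e\mathcal{L}, \varphi_E, \Delta, \sigma_E)$, and (10) $\psi'_P = \mathsf{DecryptPolicyConstants}^{\mathsf{KH}}(\psi_E, \mathcal{K})$, then $\psi_P = \psi'_P$ with high probability.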

The correctness theorem for ereduce^DET states that the decrypted result of ereduce and the result of reduce are equal. We omit the formal statement of the correctness theorem for ereduce^DET.

The security of ereduce follows directly from Theorems 2 and 3. As the security theorems do not restrict what kind of computation the adversary runs in polynomial time, ereduce can be viewed as one instance of such a computation. Hence, ereduce does not leak any additional information.


7. EQ MODE CHECK 

We now present the EQ mode check, which extends the mode
check [2] introduced in logic programming. The EQ mode
check is a static analysis of the policy that serves two
purposes: (i) it ensures that the ereduce algorithm terminates
for any policy that passes the check, and (ii) it outputs the
equality scheme $\delta$ of the policy, which is used in both Eunomia DET
and Eunomia KH (see Section 0. The EQ mode check runs 
in time linear in the size of the policy. The EQ mode
check extends the mode check described by Garg et al. by
additionally carrying provenance and key-adjustment
information. Next, we introduce modes and then explain our
EQ mode checking rules.

Mode specification. The concept of “modes” comes from
logic programming [2]. Consider the following predicate as
an example: tagged(m, q, a) is true when the message m is
tagged with principal q’s attribute a. Assuming the number
of possible messages in the English language is infinite, the
number of concrete values for the variables m, q, and a for
which tagged holds is also infinite. However, if we are given
a concrete message (i.e., a concrete value for the variable
m), then the number of concrete values for q and a for which
tagged holds is finite. Hence, we say that argument position
1 of tagged is an input position (denoted by “+”), whereas
argument positions 2 and 3 are output positions (denoted by
“−”). We call such a description of the inputs and outputs
of a predicate its mode specification. The mode
specification of a predicate signifies that, given concrete
values for the variables in input positions, the number of
concrete values for the variables in output positions for
which the predicate holds is finite. Hence
$\mathit{tagged}(m^+, q^-, a^-)$ is a valid mode specification,
whereas $\mathit{tagged}(m^-, q^-, a^+)$ is not.
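As a rough illustration of what a mode specification buys, the following Python sketch treats tagged as a finite log table and only answers queries in which every input position is bound. The table contents, the `MODES` dictionary, and the `query` helper are hypothetical names chosen for this example; they are not part of the paper's algorithm.

```python
# Hypothetical mode table: '+' marks input positions, '-' output positions.
MODES = {"tagged": ("+", "-", "-")}

# Hypothetical finite log of tagged(m, q, a) facts.
TAGGED = [
    ("msg1", "alice", "ssn"),
    ("msg1", "bob", "dob"),
    ("msg2", "carol", "ssn"),
]

def query(pred, table, pattern):
    """Return the rows of `table` matching `pattern` (None means unbound).

    Raises ValueError if an input ('+') position is left unbound, because
    the mode specification gives no finiteness guarantee in that case.
    """
    for pos, (mode, value) in enumerate(zip(MODES[pred], pattern), start=1):
        if mode == "+" and value is None:
            raise ValueError(f"{pred}: input position {pos} must be bound")
    return [row for row in table
            if all(v is None or v == r for v, r in zip(pattern, row))]

# Binding the input position m yields finitely many (q, a) pairs.
results = query("tagged", TAGGED, ("msg1", None, None))
```

A query such as `query("tagged", TAGGED, (None, None, "ssn"))` is rejected, mirroring the invalid specification in which m is treated as an output.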

EQ mode checking. The EQ mode check uses the mode
specifications of predicates to check whether a formula is
well-moded. It has two types of judgements:
$\chi_I \vdash_g g : (\chi_O, \delta)$ for guards, and
$\chi_I \vdash \varphi : \delta$ for policy formulas. Each element of
the sets $\chi_I$ and $\chi_O$ is a pair of the form $(x, p.a)$, which
signifies that we obtained a concrete value for the variable
$x$ with provenance $p.a$ (i.e., the source of the value).

The top-level judgement $\chi_I \vdash \varphi : \delta$ states that,
given ground values for the variables in the set $\chi_I$, the
formula $\varphi$ is well-moded and that auditing $\varphi$ would require
equality checking for the column pairs given by $\delta$. We call
a policy $\varphi$ well-moded if there exists a $\delta$ for which we can
derive the judgement $\{\} \vdash \varphi : \delta$. The judgement $\vdash$
uses $\vdash_g$ as a sub-judgement in the quantifier case, so we
explain $\vdash_g$ first. The judgement
$\chi_I \vdash_g g : (\chi_O, \delta)$ states that, given concrete values
for the variables in the set $\chi_I$, the number of concrete
values for the variables in the set $\chi_O$ ($\chi_O$ is a subset of
the free variables of $g$) for which the formula $g$ holds is
finite. It also outputs the column pairs that may be checked
for equality during the evaluation of $g$.


Figure 3: Selected $\chi_I \vdash_g g : (\chi_O, \delta)$ judgements


Selected mode checking rules for guards are listed in
Figure 3. We explain these rules using an example. We show
how to check the formula
$g = (p(x^-) \vee q(x^-, z^-)) \wedge r(x^+, y^-)$ with $\chi_I = \{\}$
($\chi_I = \{\}$ signifies that we do not have concrete values for
any variables yet). The function $I$ (resp., $O$) takes as
input a predicate $p$ and returns all input (resp., output)
argument positions of $p$. For instance, $I(r) = \{1\}$ and
$O(r) = \{2\}$.

First, the G-CONJ rule (for guards of the form
$g_1 \wedge g_2$) applies. Its first premise requires that
$g_1 = (p(x^-) \vee q(x^-, z^-))$ is well-moded with $\chi_I = \{\}$.
Now, the G-DISJ rule (for disjunctive guards) can be used to
check the well-modedness of $g_1$ with $\chi_I = \{\}$. Its first
and second premises require the disjuncts $p(x^-)$ and
$q(x^-, z^-)$ to be independently well-moded with the input
$\chi_I = \{\}$. To check $p(x^-)$ with $\chi_I = \{\}$, the G-PRED
rule is applicable. The first premise checks whether all
input variables of $p$, none in this case, are included in
$\chi_I$; this is trivially satisfied. We use an auxiliary
function FE for this check, defined as
$FE(\chi_I) = \{x \mid \exists p, i.\ (x, p.i) \in \chi_I\}$. After
evaluating $p$, we will get concrete values for the
variable(s) in output positions of $p$ (i.e., $x$ in this
case, with provenance $p.1$), hence $\chi_O = \{(x, p.1)\}$, which
is expressed in premise 2. Finally, we did not have a
concrete value for $x$ in $\chi_I$, so we would not need any
comparison, hence $\delta = \{\}$ (premise 3). Similarly, we can
derive $\{\} \vdash_g q(x^-, z^-) : (\{(x, q.1), (z, q.2)\}, \{\})$.
Once we have determined that both disjuncts are well-moded,
we see that we are only guaranteed to have a concrete value
for the variable $x$ (if $p(x^-)$ is true, we will not get a
concrete value for $z$), but $x$ can have either provenance
$p.1$ or $q.1$. We have to keep track of both, which is
captured using the $\Cap$ operator, defined as follows:
$\chi_1 \Cap \chi_2 = \{(x, p_1.a_1) \mid \exists p_2, a_2.\ ((x, p_1.a_1) \in \chi_1 \wedge (x, p_2.a_2) \in \chi_2) \vee ((x, p_1.a_1) \in \chi_2 \wedge (x, p_2.a_2) \in \chi_1)\}$.
So we have
$\{\} \vdash_g g_1 : (\{(x, p.1), (x, q.1)\}, \{\})$.

Now let us go back to the second premise of G-CONJ,
which requires that $r(x^+, y^-)$ is well-moded with respect
to $\chi_I = \{(x, p.1), (x, q.1)\}$. The G-PRED rule is again
applicable. The first premise, which requires that the
variables in input argument positions (i.e., $x$ in this
case) are given by $\chi_I$, is satisfied. According to the
second premise, we will additionally get a concrete value for
$y$ with provenance $r.2$, hence
$\chi_O = \{(x, p.1), (x, q.1), (y, r.2)\}$. Finally, a concrete
value for $x$ (with provenance $p.1$ and $q.1$) is needed
while evaluating $r$ ($x$ is in input argument position 1 of
$r$), hence we need to check for equality between the column
pairs $p.1$, $r.1$ and $q.1$, $r.1$, i.e.,
$\delta = \{(p.1, r.1), (q.1, r.1)\}$. The directionality of the
pairs, e.g., $(p.1, r.1)$, matters for efficiency. For
instance, suppose we are given a concrete value for the
variable $x$ hashed with key $k_{p.1}$. While evaluating $r$,
we need a concrete value of $x$ ($x$ is in an input argument
position of $r$). Now, if we are given the token
$\Delta_{r.1 \to p.1}$, we cannot directly evaluate $r$ without
incurring additional computational overhead.
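The walkthrough above can be replayed with a small Python sketch of a guard mode checker. This is a minimal sketch, not the paper's rules: the class names, the `MODES` table, and the helper names are hypothetical, predicate arguments are restricted to variables, and the way `delta` is computed in the predicate case (one pair per known provenance of an already ground variable) is an assumption inferred from the example.

```python
from dataclasses import dataclass

# Hypothetical mode table for the running example: '+' input, '-' output.
MODES = {"p": ("-",), "q": ("-", "-"), "r": ("+", "-")}

@dataclass(frozen=True)
class Pred:
    name: str
    args: tuple  # variable names only, to keep the sketch small

@dataclass(frozen=True)
class Conj:
    left: object
    right: object

@dataclass(frozen=True)
class Disj:
    left: object
    right: object

def fe(chi):
    """FE(chi): the variables that occur with some provenance in chi."""
    return {x for (x, _) in chi}

def meet(chi1, chi2):
    """Keep variables grounded by both sets, with every provenance seen."""
    common = fe(chi1) & fe(chi2)
    return {(x, prov) for (x, prov) in chi1 | chi2 if x in common}

def check_guard(g, chi_in):
    """Return (chi_out, delta); raise ValueError if g is not well-moded."""
    if isinstance(g, Pred):
        bound = fe(chi_in)
        chi_out, delta = set(chi_in), set()
        for pos, (mode, x) in enumerate(zip(MODES[g.name], g.args), start=1):
            col = f"{g.name}.{pos}"
            if x in bound:
                # Assumption: a ground variable is compared with this column.
                delta |= {(prov, col) for (y, prov) in chi_in if y == x}
            elif mode == "+":
                raise ValueError(f"{x} must be ground before evaluating {g.name}")
            else:
                chi_out.add((x, col))  # an output position grounds x
        return chi_out, delta
    if isinstance(g, Conj):
        # The right conjunct sees the variables grounded by the left one.
        chi_mid, d1 = check_guard(g.left, chi_in)
        chi_out, d2 = check_guard(g.right, chi_mid)
        return chi_out, d1 | d2
    if isinstance(g, Disj):
        # Both disjuncts are checked independently against the same input.
        chi1, d1 = check_guard(g.left, chi_in)
        chi2, d2 = check_guard(g.right, chi_in)
        return meet(chi1, chi2), d1 | d2
    raise TypeError(f"unsupported guard: {g!r}")

# g = (p(x) OR q(x, z)) AND r(x, y), checked with an empty chi_in.
g = Conj(Disj(Pred("p", ("x",)), Pred("q", ("x", "z"))), Pred("r", ("x", "y")))
chi_out, delta = check_guard(g, set())
```

Under these assumptions, `chi_out` contains x with provenances p.1 and q.1 and y with provenance r.2, and `delta` contains the pairs (p.1, r.1) and (q.1, r.1), matching the example.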

Top-level mode checking rules for policy formulas are very
similar to those for guards, except that formulas do not
ground variables. We show the rule for universal
quantification below. The audit algorithm checks formulas of
the form $\forall \vec{x}.(g \rightarrow \varphi)$ by first obtaining a
finite number of substitutions for $\vec{x}$ that satisfy $g$ and
then checking whether $\varphi$ holds for each of these
substitutions.

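$$\frac{\chi_I \vdash_g g : (\chi_O, \delta_g) \qquad \vec{x} \subseteq FE(\chi_O) \qquad fv(g) \subseteq FE(\chi_I) \cup \{\vec{x}\} \qquad \chi_O \vdash \varphi : \delta_c}{\chi_I \vdash \forall \vec{x}.(g \rightarrow \varphi) : \delta_g \cup \delta_c}\ \textsc{Univ}$$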

The first premise of UNIV checks that we have only a finite
number of substitutions for $\vec{x}$ that satisfy $g$ with
respect to $\chi_I$ and with equality scheme $\delta_g$. This is
necessary for termination while checking universal formulas,
as the domain of the variables can be infinite. We then check
whether we have substitutions for all quantified variables
$\vec{x}$ and also all the free variables of $g$. Finally, we
inductively check whether $\varphi$ is well-moded with respect to
the ground variables we obtained while evaluating $g$, with
equality scheme $\delta_c$. The resulting equality scheme is then
$\delta_g \cup \delta_c$.
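The guard-first strategy for universally quantified formulas can be sketched in Python as follows. This is only an illustration on plaintext data: the relations `SEND` and `TAGGED`, the example policy, and the function names are hypothetical, and the sketch simply lists violating substitutions rather than reproducing what reduce or ereduce actually compute.

```python
# Hypothetical plaintext log relations.
SEND = [("alice", "bob", "msg1"), ("alice", "carol", "msg2")]
TAGGED = [("msg1", "bob", "dob"), ("msg2", "dave", "ssn")]

def guard_substitutions():
    """Enumerate substitutions satisfying send(p1, p2, m) AND tagged(m, q, a).

    send has only output positions, so scanning it grounds p1, p2 and m;
    tagged is then evaluated with its input position m already ground.
    """
    subs = []
    for (p1, p2, m) in SEND:
        for (m2, q, a) in TAGGED:
            if m2 == m:  # equality check between columns send.3 and tagged.1
                subs.append({"p1": p1, "p2": p2, "m": m, "q": q, "a": a})
    return subs

def audit_universal(subs, phi):
    """Return the substitutions that satisfy the guard but falsify phi."""
    return [s for s in subs if not phi(s)]

# Hypothetical policy body: a message may only be sent to the principal
# it is tagged with.
violations = audit_universal(guard_substitutions(), lambda s: s["p2"] == s["q"])
```

Because the guard yields finitely many substitutions, the loop over them terminates even though the domain of each variable may be infinite.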

8. IMPLEMENTATION AND EVALUATION 

We report on our empirical evaluation of ereduce on the 
Eunomia DET and Eunomia KH schemes. We run experiments 
on a 2.67GHz Intel Xeon X5650 CPU with Debian Linux 7.6 
and 50GB of RAM, of which no more than 3.0 GB is used. 
SQLite version 3.8.7.1 is used to store the plaintext and en¬ 
crypted logs. We aggressively index all database columns in 
input argument positions as specified in mode specifications. 
In Eunomia DET, the index is built over deterministically
encrypted values; in Eunomia KH, the index is built over
hashed values. For deterministic encryption, we use AES with
a variation of the CMC mode [22] with a fixed IV and a
16-byte block size. We use 256-bit keys. For the AKH scheme,
we use the library by Popa et al. The underlying elliptic
curve is the NIST-approved NID_X9_62_prime192v1.
We use privacy policies derived from the GLBA and HIPAA 
privacy rules and cover 4 and 13 representative clauses of 
these rules, respectively. 
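To illustrate why an index over deterministically encrypted values supports equality lookups without any cryptographic work at query time, here is a minimal sketch using Python's standard `sqlite3` module. It is not the paper's implementation: HMAC-SHA256 is used only as a deterministic stand-in for the AES-based scheme (it cannot be decrypted), and the table, key, and helper names are hypothetical.

```python
import hashlib
import hmac
import sqlite3

KEY = b"demo-key-not-for-production"  # hypothetical key

def det_token(value: str) -> bytes:
    """Deterministic stand-in for encryption: equal inputs give equal outputs."""
    return hmac.new(KEY, value.encode(), hashlib.sha256).digest()

conn = sqlite3.connect(":memory:")  # RAM-backed database
conn.execute("CREATE TABLE tagged (m BLOB, q BLOB, a BLOB)")
# Index the column in the input argument position (position 1 of tagged).
conn.execute("CREATE INDEX tagged_m ON tagged (m)")

rows = [("msg1", "alice", "ssn"), ("msg1", "bob", "dob"), ("msg2", "carol", "ssn")]
conn.executemany(
    "INSERT INTO tagged VALUES (?, ?, ?)",
    [tuple(det_token(v) for v in row) for row in rows],
)

# The lookup compares opaque byte strings only; the token for "msg1" is
# computed here for the demo and would otherwise arrive already encrypted.
matches = conn.execute(
    "SELECT q, a FROM tagged WHERE m = ?", (det_token("msg1"),)
).fetchall()
```

The query returns the two encrypted (q, a) pairs stored for the message, and the database never sees a plaintext value.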
Figure 4: Experimental results HIPAA


We use synthetically generated plaintext audit logs. Given 
an input policy and a desired number of privacy sensitive ac¬ 
tions, our audit log generation algorithm randomly decides 
whether each action will be policy compliant or not. To 
generate log entries for a compliant action, the algorithm 
traverses the abstract syntax tree of the policy and gener¬ 
ates instances of atoms that together satisfy the policy. For 
the non-compliant actions, we randomly choose atoms to 
falsify a necessary condition. Our synthetic log generator 
also outputs the mOPED data structure but with plain¬ 
text values for timestamps. We generate logs with 2000 to 
14000 privacy-sensitive actions. Each plaintext log is en¬ 
crypted with both the Eunomia DET and Eunomia KH schemes. 
The maximum plaintext audit log size we considered is 17 
MB. The corresponding maximum encrypted log sizes in 
Eunomia DET and Eunomia KH are 67.3 MB and 267 MB,
respectively. Most of the size of the Eunomia KH-encrypted
log comes from the keyed hashes.
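For reference, the size blow-up implied by these figures can be worked out directly from the reported maximum log sizes (17 MB plaintext, 67.3 MB and 267 MB encrypted); the ratios below are derived here and rounded.

```latex
\[
\frac{67.3\ \text{MB}}{17\ \text{MB}} \approx 3.96,
\qquad
\frac{267\ \text{MB}}{17\ \text{MB}} \approx 15.7,
\qquad
\frac{267\ \text{MB}}{67.3\ \text{MB}} \approx 3.97 .
\]
```

That is, the largest Eunomia DET log is roughly 4 times the plaintext size, and the largest Eunomia KH log is roughly 16 times the plaintext size, or about 4 times the Eunomia DET size.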

We measure the relative overhead of running ereduce on 
logs encrypted with Eunomia DET and Eunomia KH , choosing 
reduce on plaintext audit log as the baseline. We exper¬ 
iment with both RAM-backed and disk-backed versions of 
SQLite. We report here only the memory-backed results (the
disk-backed results are similar). Figure 4 shows the average
execution time per privacy-sensitive action for the HIPAA
policy in all three configurations (GLBA results are similar).
The number of privacy-sensitive actions (and, hence, the log 
size) varies on the x-axis. The overhead of Eunomia DET is 
very low, between 3% and 9%. This is unsurprising, because 
no cryptographic operations are needed during audit. The 
overhead comes from the need to read and compare longer 
(encrypted) fields in the log and from having to use the 
mOPED data structure. With Eunomia KH , overheads are 
much higher, ranging from 63% to 406%. These overheads 
come entirely from two sources: the cost of reading a much 
larger database and the cost of performing hash adjustments 
to check equality of values in different columns. We observe 
that the overhead due to the increased database size is more 
than that due to hash adjustment. For the policies we exper¬ 
imented with, the per-action overhead due to database size 
grows linearly, but the overhead due to hash adjustments 
is relatively constant. There is substantial room (i.e., 30%
of the total overhead incurred by Eunomia KH) for improving
the efficiency of ereduce on Eunomia KH, e.g., by caching
previous key adjustments, which we currently do not do.
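To relate the reported percentages to slowdown factors, assume the relative overhead is defined in the usual way with respect to the plaintext baseline (this definition is an assumption; the text does not spell it out):

```latex
\[
\text{overhead} = \frac{t_{\text{enc}} - t_{\text{plain}}}{t_{\text{plain}}}
\quad\Longrightarrow\quad
\frac{t_{\text{enc}}}{t_{\text{plain}}} = 1 + \text{overhead}.
\]
```

Under that assumption, overheads of 3% to 9% correspond to running times of about 1.03 to 1.09 times the baseline, and overheads of 63% to 406% correspond to about 1.63 to 5.06 times the baseline.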

9. RELATED WORK 

In this section, we briefly review the existing work that
is most relevant to our approach.

Functional encryption: Functional encryption schemes
[20, 28, 30] enable one (possibly with the possession of some
tokens) to calculate a function value over encrypted
arguments. The output of the function is in plaintext, unlike
in homomorphic encryption [18]. Our approach can be viewed as
attempting to mimic functional encryption where the function
is compliance checking of a given policy. Our setting differs
from functional encryption as follows: (i) some of the output
can be encrypted in our case, (ii) we have multiple
arguments, each of which can be encrypted with a different
key, and (iii) we cannot hide the function being computed, as
the policy is public knowledge. However, Goldwasser et al.
introduce multi-input functional encryption, in which
multiple arguments are considered.

Predicate encryption: Property-preserving encryption or
predicate encryption [38, 12, 25] can be viewed as a special
case of functional encryption where the function returns a
boolean value. Compliance checking can be viewed as a
variation of predicate encryption where the predicate outputs
‘0’ for violation of the policy and ‘1’ for satisfaction.
Traditionally, predicate encryption schemes consider
arguments encrypted with a single encryption key, whereas in
our case we have arguments encrypted with multiple encryption
keys. Pandey and Rouselakis present several notions of
security for symmetric predicate encryption and construct one
such encryption scheme for checking the orthogonality of two
encrypted vectors ($\vec{x} \cdot \vec{y} \stackrel{?}{=} 0 \bmod p$).
Our security definition IND-ECPLA is inspired by their LoR
security notion. Shen et al. [38] present a symmetric
predicate encryption scheme that supports inner product
queries. Their approach can also be used for equality
checking, but using it would make database indexing unusable
in our case. They introduce a notion of security called
predicate privacy. Our approach cannot provide predicate
privacy, as the policy is known to the adversary.

Structured encryption: Chase and Kamara introduce
structured encryption for structured data, which maintains
the structure of the data after encryption. They construct
encryption schemes for graph structures that allow adjacency
queries, neighboring queries, and focused subgraph queries.
Our encryption of the audit logs can be viewed as an instance
of structured encryption where the queries of interest are
those relevant to compliance checking with respect to a given
policy.

Searchable encrypted audit log: Waters et al. present a
framework that provides both confidentiality and integrity
protection with the ability to search the encrypted audit
logs based on keywords [41]. They use hash chains for
integrity protection and identity-based encryption [10] with
extracted keywords to enable searching the audit log [9]. We
only consider confidentiality of the data and assume the
existence of complementary techniques to ensure integrity of
the audit log [35, 36, 26, 24]. The policies we consider are
more expressive than theirs, and we support time-stamp
comparison, which their framework does not allow.

Order-preserving encryption: Boldyreva et al. [8] present
a symmetric encryption scheme that maintains the order of
the plaintext data but does not satisfy the ideal IND-OCPA
security definition. Popa et al. present the mOPE scheme,
which we enhance to support timestamp comparison with
displacements [32]. Recently, Kerschbaum and Schröpfer
presented a keyless order-preserving encryption scheme for
outsourced data [27]. In their approach, the owner of the
plaintext data is required to keep a dictionary mapping
plaintexts to ciphertexts, which is undesirable in our
scenario.


Querying outsourced database: Hacıgümüş et al. develop a
system that allows querying over encrypted data. Their
strategy is to ask the client to decrypt data to enable
operations on the encrypted data. Tu et al. [40] introduce
split client/server query execution for processing analytical
queries on encrypted databases. Eunomia does not require
any query processing on the client side as it is untrusted.
Damiani et al. [lb] developed a secure indexing approach for 
querying an encrypted database. We do not require modifi¬ 
cation to the indexing algorithm of the DBMS. 

CryptDB, developed by Popa et al. [33], allows queries over
encrypted databases. Its goal is to leave the application
front-end, which runs on the client side, untouched and to
rely on a trusted proxy to perform the appropriate
encryptions and decryptions of the queries and the results.
Our setting is different in that our application runs in the
cloud without requiring a trusted proxy. Unlike CryptDB,
Eunomia supports a restricted set of SQL queries for
auditing. We prove end-to-end security guarantees for our
algorithm, which has not been done for CryptDB.

Privacy policy compliance checking: Prior work on
logic-based compliance checking algorithms focuses on
plaintext logs [7]. We are the first to use a logic-based
approach to check encrypted logs for policy compliance.

Auditing retention policies: Lu et al. [29] present a
framework for auditing changes to a database with respect to
a retention policy. They also consider incomplete retention
histories. However, they work on plaintext audit logs.
Eunomia would need to be enhanced to support their queries.

10. SUMMARY 

We presented an auditing algorithm that checks compliance
over encrypted audit logs with respect to an expressive
class of privacy policies. We introduced a novel notion of
audit log equivalence that enables us to obtain an end-to-end
security definition that precisely captures the information
leakage during the auditing process. We then proved two
instances of the auditing algorithm, which differ in the
encryption scheme, secure under this definition. Empirical
evaluation demonstrates that both instances of our algorithm
have low to moderate overhead compared to a baseline
algorithm that only supports plaintext audit logs.